ABSTRACT



A group is Markov if it admits a prefix-closed regular language 
of unique representatives with respect to some generating set, 
and strongly Markov if it admits such a language of unique 
minimal-length representatives over every generating set. This 
paper considers the natural generalizations of these concepts to 
semigroups and monoids. Two distinct potential generalizations 
to monoids are shown to be equivalent. Various interesting ex- 
amples are presented, including an example of a non-Markov 
monoid that nevertheless admits a regular language of unique 
representatives over any generating set. It is shown that all
finitely generated commutative semigroups are strongly Markov,
but that finitely generated subsemigroups of virtually abelian or
polycyclic groups need not be. Potential connections with word-
hyperbolic semigroups are investigated. A study is made of the
interaction of the classes of Markov and strongly Markov semi-
groups with direct products, free products, and finite-index sub-
semigroups and extensions. Several questions are posed.

1 INTRODUCTION 

The notion of Markov groups was introduced by Gromov in his 
seminal paper on hyperbolic groups [Gro87, § 5.2], and explored further by 
Ghys & de la Harpe [GdlH90a]. A group is Markov if it admits a language
of unique representatives, with respect to some generating set, that can be
described by a Markov grammar. In this context, a Markov grammar is es-
sentially a finite state automaton with one initial state and every state being
an accept state. The connection with hyperbolic groups arises because every
hyperbolic group admits such a language of minimal-length unique represen-
tatives; such groups are said to be strongly Markov [GdlH90a, Théorème 13].
Strongly Markov groups have rational growth series with respect to any gen-
erating set [GdlH90a, Corollaire 14].

The overarching aim of this paper is to begin to investigate the natural 
generalization to semigroups of this notion of Markov groups. A motivation 
for this is the fruitful generalization from groups to semigroups of concepts 
involving automata and languages, such as automatic structures (for groups, 
see [ECH+92], for semigroups, [CRRT01]), automatic presentations (see, for
example, [OT05, CORT09]), and automaton semigroups (for groups, see the
monograph [Nek05], for semigroups, see for example [Mal09, SS05]).

After recalling some necessary background definitions and results in § 2, 
the generalization of the definition to monoids and semigroups is given in § 3. 
The generalization to monoids is immediate: a Markov monoid is a monoid 
admitting a language of unique representatives described by a Markov gram- 
mar (again, essentially a finite state automaton with a unique initial state and 
every state being an accept state), which is equivalent to admitting a prefix- 
closed regular language of unique representatives (see Proposition 3.1 below). 
A monoid is strongly Markov if it admits a prefix-closed language of unique 
minimal-length representatives with respect to any generating set. However, 
since the empty word is not in general a valid representative for an element of 
a semigroup, generalizing the definition to semigroups entails excluding the 
empty word from the otherwise prefix-closed language of unique representa- 
tives. Thus there are, for monoids, distinct notions of 'Markov as a monoid' 
and 'Markov as a semigroup'; fortunately, the concepts turn out to be equiva- 
lent, as proved in § 4. 

Some of the basic properties of Markov semigroups are explained in § 
5. An example of a non-Markov monoid that nevertheless admits a regu- 
lar (non-prefix-closed) language of unique representatives with respect to any 
generating set is given in § 6. How certain rewriting systems naturally give 
rise to Markov semigroups is shown in § 7. That finitely generated com- 
mutative semigroups are strongly Markov is shown in § 9. Next, § 10 shows 
that finitely generated subsemigroups of polycyclic or virtually abelian groups 
need not be Markov, and discusses the importance of these facts. § 11 exhibits 
some other interesting examples of Markov semigroups and some examples 
of non-Markov semigroups. 

Given the intimate connection between hyperbolic groups and Markov 
groups discussed above, it is natural to look for a parallel between semi- 
groups that are word-hyperbolic in the sense of Duncan & Gilman [DG04] 
and Markov semigroups. However, as discussed in § 12, a word-hyperbolic 
semigroup need not even admit a regular language of unique normal forms, 
let alone a prefix-closed one. 

§§ 13-16 examine the interaction of Markov semigroups with adjoining 
identities and zeros, with direct products, with free products, and with finite- 
index subsemigroups and extensions. Finally, the class of languages that are 
Markov languages for semigroups is considered in § 17. 

Since Markov semigroups seem to be an entirely new area, there are many 
possible directions for further research. Consequently, various open questions 
are scattered throughout the paper in the relevant contexts.

We remark that the research described in this paper has involved draw- 
ing techniques, ideas, and examples from a broad swathe of semigroup and 
formal language theory. 



2 PRELIMINARIES 

2.1 Generators, alphabets, and words 

The notation used in this paper distinguishes a word from the 
element of the semigroup or monoid it represents. Let A be an alphabet rep- 
resenting a set of generators for a semigroup or monoid S. Formally, there is 
a map $\phi : A \to S$ that extends to a surjective homomorphism $\phi : A^+ \to S$ (or
$\phi : A^* \to S$ if S is a monoid).

While occasionally the representation map $\phi$ will be explicitly mentioned,
generally the following notational distinction will suffice: for a word $w \in A^*$,
denote by $\overline{w}$ the element of S represented by w (so that $\overline{w} = w\phi$); for a set
of words $W \subseteq A^*$, denote by $\overline{W}$ the set of all elements of S represented by at
least one word in W. Notice that the empty word $\varepsilon$ is a valid representative
word if and only if S is a monoid.

2.2 Languages and automata 

For background information on regular and context-free lan- 
guages and finite automata, see [HU79, Ch. 2-4]. 

Let L be a language over an alphabet A. Then L is prefix-closed if

$$(\forall u \in A^*, v \in A^+)(uv \in L \implies u \in L),$$

and L is closed under taking non-empty prefixes, or more succinctly +-prefix-closed,
if

$$(\forall u \in A^+, v \in A^+)(uv \in L \implies u \in L).$$

Notice that if L is prefix-closed and non-empty, it contains the empty word $\varepsilon$.
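
The two closure conditions can be checked mechanically on a finite language. The following is a minimal Python sketch, assuming words are encoded as strings and the empty word as the empty string; the function names and the sample languages are illustrative and do not come from the text.

```python
def proper_prefixes(word, include_empty=True):
    """Proper prefixes u of word = uv with v non-empty."""
    start = 0 if include_empty else 1
    return [word[:i] for i in range(start, len(word))]

def is_prefix_closed(language):
    """uv in L with v non-empty implies u in L (u may be empty)."""
    return all(u in language for w in language for u in proper_prefixes(w))

def is_plus_prefix_closed(language):
    """Same condition, but only non-empty prefixes u are required to lie in L."""
    return all(u in language
               for w in language
               for u in proper_prefixes(w, include_empty=False))

L1 = {"", "a", "ab", "abb"}   # prefix-closed
L2 = {"a", "ab", "abb"}       # +-prefix-closed, but lacks the empty word
L3 = {"ab"}                   # neither: the prefix "a" is missing
```

On these samples, L1 satisfies both conditions, L2 satisfies only the second, and L3 satisfies neither.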

2.3 String-rewriting systems 

This subsection contains facts about string rewriting needed later 
in the paper. For further background information, see [BO93]. 

A string rewriting system, or simply a rewriting system, is a pair $(A, \mathcal{R})$, where
A is a finite alphabet and $\mathcal{R}$ is a set of pairs $(\ell, r)$, known as rewriting rules,
drawn from $A^* \times A^*$. The single reduction relation $\Rightarrow$ is defined as follows:
$u \Rightarrow v$ (where $u, v \in A^*$) if there exists a rewriting rule $(\ell, r) \in \mathcal{R}$ and words
$x, y \in A^*$ such that $u = x\ell y$ and $v = xry$. That is, $u \Rightarrow v$ if one can obtain
v from u by substituting the word r for a subword $\ell$ of u, where $(\ell, r)$ is a
rewriting rule. The reduction relation $\Rightarrow^*$ is the reflexive and transitive closure
of $\Rightarrow$. The process of replacing a subword $\ell$ by a word r, where $(\ell, r) \in \mathcal{R}$, is
called reduction, as is the iteration of this process.

A word $w \in A^*$ is reducible if it contains a subword $\ell$ that forms the left-
hand side of a rewriting rule in $\mathcal{R}$; it is otherwise called irreducible.

The string rewriting system $(A, \mathcal{R})$ is noetherian if there is no infinite se-
quence $u_1, u_2, \ldots \in A^*$ such that $u_i \Rightarrow u_{i+1}$ for all $i \in \mathbb{N}$. That is, $(A, \mathcal{R})$
is noetherian if any process of reduction must eventually terminate with an
irreducible word. The rewriting system $(A, \mathcal{R})$ is confluent if, for any words
$u, u', u'' \in A^*$ with $u \Rightarrow^* u'$ and $u \Rightarrow^* u''$, there exists a word $v \in A^*$ such
that $u' \Rightarrow^* v$ and $u'' \Rightarrow^* v$.

The string rewriting system $(A, \mathcal{R})$ is non-length-increasing if $(\ell, r) \in \mathcal{R}$ im-
plies that $|\ell| \geq |r|$ and is length-reducing if $(\ell, r) \in \mathcal{R}$ implies that $|\ell| > |r|$.
Observe that any length-reducing rewriting system is necessarily noetherian.

The rewriting system $(A, \mathcal{R})$ is monadic if it is length-reducing and the right-
hand side of each rule in $\mathcal{R}$ lies in $A \cup \{\varepsilon\}$; it is special if it is length-reducing
and each right-hand side is the empty word $\varepsilon$. Observe that every special
rewriting system is also monadic.

The string rewriting system $(A, \mathcal{R})$ is finite if the set of rules $\mathcal{R}$ is finite.
A monadic rewriting system $(A, \mathcal{R})$ is regular (respectively, context-free) if, for
each $a \in A \cup \{\varepsilon\}$, the set of all left-hand sides of rules in $\mathcal{R}$ with right-hand
side a is regular (respectively, context-free).

Let $(A, \mathcal{R})$ be a confluent noetherian string rewriting system. Then for any
word $u \in A^*$, there is a unique irreducible word $v \in A^*$ with $u \Rightarrow^* v$ [BO93,
Theorem 1.1.12]. The irreducible words are said to be in normal form. The
monoid presented by $\langle A \mid \mathcal{R} \rangle$ may be identified with the set of normal form
words under the operation of 'concatenation plus reduction to normal form'.
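
The operation of 'concatenation plus reduction to normal form' can be sketched in Python. The example below assumes the special rewriting system over {b, c} with the single rule (bc, ε), which presents the bicyclic monoid; this system is an illustrative choice not taken from the text. It is length-reducing, hence noetherian, and confluent because the rule has no overlaps. The helper names are hypothetical.

```python
RULES = [("bc", "")]  # one special rule: bc -> empty word (illustrative system)

def reduce_once(word, rules):
    """Apply the first applicable rule at its leftmost occurrence.

    Returns None when the word is irreducible. Left-hand sides are assumed
    to be non-empty, as in any length-reducing system.
    """
    for left, right in rules:
        i = word.find(left)
        if i != -1:
            return word[:i] + right + word[i + len(left):]
    return None

def normal_form(word, rules=RULES):
    """Iterate single reductions until an irreducible word is reached."""
    while True:
        reduced = reduce_once(word, rules)
        if reduced is None:
            return word
        word = reduced

def multiply(u, v, rules=RULES):
    """Product of two normal-form words: concatenate, then reduce."""
    return normal_form(u + v, rules)
```

For instance, `normal_form("bbcc")` reduces first to `"bc"` and then to the empty word, and `multiply("cb", "cb")` returns `"cb"`.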



3 DEFINITIONS 

As defined by Ghys & de la Harpe [GdlH90a, Definition 4], a
group is Markov if it admits a language of unique representatives defined by
a Markov grammar, which is essentially a finite state automaton where every
state is an accept state [GdlH90a, Definition 1]. The following result shows
that the class of languages recognized by such automata is precisely the class
of prefix-closed regular languages. In general, arguments in this paper work with regular
expressions rather than explicitly constructed automata, so the equivalences
embodied in this result and in the later Proposition 3.4 are important.

Proposition 3.1. A regular language is prefix-closed if and only if it is recognized 
by a finite state automaton in which every state is an accept state. 

Proof of 3.1. Suppose L is prefix-closed and let $\mathcal{A}$ be a trim deterministic finite
state automaton recognizing L. Let q be some state of $\mathcal{A}$. Since $\mathcal{A}$ is trim, q lies
on a path from the initial state to an accept state. Let w be the label on such a
path, with w' being the label before the first visit to q. Then w', being a prefix
of w, also lies in L. Since $\mathcal{A}$ is deterministic, there is only one path starting at
the initial state labelled by w', and this path ends at q. Since $w' \in L$, it follows
that q is an accept state. Therefore, since q was arbitrary, every state of $\mathcal{A}$ is
an accept state.

Suppose that L is accepted by an automaton $\mathcal{A}$ in which every state is
an accept state. Let $w \in L$ and let w' be some prefix of w. Then w labels
a path starting at the initial state of $\mathcal{A}$ and leading to an accept state. The
prefix w' labels an initial segment of this path, ending at a state q, which, by
hypothesis, is also an accept state. Thus $w' \in L$. Since $w \in L$ was arbitrary, L
is prefix-closed. [3.1]
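
The easy direction of this proposition can be observed on a small example. The sketch below assumes a hypothetical two-state deterministic automaton over {a, b} in which every state is an accept state and missing transitions reject; it recognizes the language a*b*. The transition table and function names are illustrative only.

```python
from itertools import product

INITIAL = 0
# Partial transition function; every state (0 and 1) is an accept state.
DELTA = {(0, "a"): 0, (0, "b"): 1, (1, "b"): 1}

def accepts(word):
    """Follow the unique path labelled by `word`; fail on a missing edge."""
    state = INITIAL
    for letter in word:
        if (state, letter) not in DELTA:
            return False
        state = DELTA[(state, letter)]
    return True  # whichever state is reached, it is an accept state

def language_up_to(n, alphabet="ab"):
    """All accepted words of length at most n."""
    return {"".join(t)
            for k in range(n + 1)
            for t in product(alphabet, repeat=k)
            if accepts("".join(t))}

def is_prefix_closed(words):
    return all(w[:i] in words for w in words for i in range(len(w)))
```

Every prefix of an accepted word labels an initial segment of an accepting path, so the truncated language returned by `language_up_to` passes `is_prefix_closed` for any bound.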

In light of Proposition 3.1, a group is Markov if it admits a prefix-closed 
regular language of unique representatives. Now, in generalizing the notion 
of being Markov from groups to semigroups, one must change from monoid
to semigroup generating sets and modify the notion of the language of repre- 
sentatives appropriately. For groups, the language of representatives is taken 
over an alphabet representing a monoid generating set for the group, with the 
empty word being the representative of the identity. (Indeed, the empty word 
lies in any non-empty prefix-closed language.) In generalizing to arbitrary 
semigroups, it is necessary to use a semigroup generating set, in which case 
the empty word is no longer admissible as a representative, and the natural
definition for the language of representatives requires not prefix-closure, but
only +-prefix-closure.

This raises a potential problem, in that a monoid (possibly a group) could 
be Markov in two different ways: it could be Markov as a monoid (allowing, 
or rather requiring, that the identity be represented by the empty word), or 
Markov as a semigroup (requiring that the identity be represented by a non- 
empty word). It is thus conceivable that the class of monoids that are Markov 
as monoids and the class of monoids that are Markov as semigroups are dis- 
tinct. Fortunately, however, the two notions are equivalent, as will be shown 
in §4. 

The definition of 'Markov as a monoid' is given first, since it is the more 
direct generalization from the group case: 

Definition 3.2. Let M be a monoid and let A be a finite alphabet representing 
a monoid generating set for M. For $x \in M$, let $\lambda_A(x)$ be the length of the
shortest word over A representing x; this is called the natural length of x.
(Notice that $\lambda_A(1_M) = 0$.)

A monoid Markov language for M over A is a regular language L that is 
prefix-closed and contains a unique representative for every element of M. 

A robust monoid Markov language for M over A is a regular language L that 
is prefix-closed and contains a unique representative for every element of M 
such that $|w| = \lambda_A(\overline{w})$ for every $w \in L$.

The monoid M is Markov (as a monoid) if there exists a monoid Markov 
language for M over an alphabet representing some monoid generating set for 
M. 

The monoid M is robustly Markov (as a monoid) with respect to an alphabet 
A representing a generating set for M if there exists a robust monoid Markov 
language for M over A. 

The monoid M is strongly Markov (as a monoid) if, for every alphabet A 
representing a monoid generating set for M, there exists a robust monoid 
Markov language for M over A. 
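
For a small finite monoid these notions can be computed directly. The Python sketch below assumes the cyclic group of order 7, written additively, with a hypothetical alphabet in which a represents 1 and b represents 3. A breadth-first search that tries letters in alphabetical order finds, for each element, its shortlex-least representative; in this example the resulting finite language is prefix-closed and consists of unique minimal-length representatives. The function name and the example are illustrative, not taken from the text.

```python
from collections import deque

def shortlex_representatives(generators, multiply, identity):
    """Map each element to the shortlex-least word representing it.

    `generators` maps letters to elements; the search extends only words
    already chosen as representatives, in alphabetical order of letters.
    """
    reps = {identity: ""}
    queue = deque([identity])
    while queue:
        x = queue.popleft()
        for letter in sorted(generators):
            y = multiply(x, generators[letter])
            if y not in reps:
                reps[y] = reps[x] + letter
                queue.append(y)
    return reps

GENS = {"a": 1, "b": 3}  # hypothetical generating set of Z_7
reps = shortlex_representatives(GENS, lambda x, y: (x + y) % 7, 0)
language = set(reps.values())
natural_length = {x: len(word) for x, word in reps.items()}
```

Here the element 5 receives the representative `"aab"`, so its natural length over this alphabet is 3, and the identity is represented by the empty word.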

The term 'robustly Markov' is introduced because there
are many natural examples of semigroups that admit a Markov language
of minimal-length representatives while not being strongly Markov (see for
example Proposition 7.1), and such semigroups still enjoy certain
pleasant properties.

Note that Ghys & de la Harpe [GdlH90a] use different terminology: rather
than 'Markov (respectively, strongly Markov) groups', they use (terms that 
translate as) 'groups with the Markov (respectively, strong Markov) prop- 
erty'. We prefer Gromov's original terminology, since it does not clash with 
'Markov property' in the sense of an undecidable semigroup-theoretic prop-
erty (see [Mar51] and [BO93, Theorem 7.3.7]).

Definition 3.3. Let S be a semigroup and let A be a finite alphabet represent-
ing a generating set for S. For $x \in S$, let $\lambda_A(x)$ be the length of the shortest
non-empty word over A representing x; this is called the natural length of x.
(Notice that if S is a monoid, $\lambda_A(1_S)$ is not zero.)

A semigroup Markov language for S over A is a regular language L that 
does not contain the empty word, is +-prefix-closed, and contains a unique 
representative for every element of S. 

A robust semigroup Markov language for S over A is a regular language L that 
does not contain the empty word, is +-prefix-closed, and contains a unique 
representative for every element of S such that $|w| = \lambda_A(\overline{w})$ for every $w \in L$.

The semigroup S is Markov (as a semigroup) if there exists a semigroup 
Markov language for S over an alphabet representing some generating set for 
S. 

The semigroup S is robustly Markov (as a semigroup) with respect to an al- 
phabet A representing a generating set for S if there exists a robust semigroup 
Markov language for S over A. 

The semigroup S is strongly Markov (as a semigroup) if, for every alphabet 
A representing a generating set for S, there exists a robust semigroup Markov 
language for S over A. 

The following result is the parallel of Proposition 3.1 that applies to +-
prefix-closed languages:

Proposition 3.4. A regular language that does not contain the empty word is +- 
prefix-closed if and only if it is recognized by a finite state automaton in which every 
state except the initial state is an accept state, and in which there are no incoming 
edges to the initial state. 

Proof of 3.4. Suppose L is +-prefix-closed and does not contain the empty
word. Let $\mathcal{A}$ be a trim deterministic finite state automaton recognizing L.
Since L does not contain the empty word, the initial state $q_0$ is not an accept
state. Let q be some other state of $\mathcal{A}$. Since $\mathcal{A}$ is trim, q lies on a path from the
initial state to an accept state. Let w be the label on such a path, with $w' \neq \varepsilon$
being the label before the first visit to q. Then w', being a non-empty prefix
of w, also lies in L. Since $\mathcal{A}$ is deterministic, there is only one path starting
at the initial state labelled by w', and this path ends at q. Since $w' \in L$, it
follows that q is an accept state. Therefore, since q was arbitrary, every state
of $\mathcal{A}$ other than $q_0$ is an accept state. Finally, suppose, with the aim of obtaining a contra-
diction, that there is an incoming edge from a state p to the initial state $q_0$.
Then, since $\mathcal{A}$ is trim, there is a word w labelling a path from $q_0$ to an accept
state, including this edge from p to $q_0$. Let w' be the prefix of w labelling the
non-empty initial segment of the path from $q_0$ back to $q_0$. Then, since $q_0$ is
not an accept state and $\mathcal{A}$ is deterministic, $w' \notin L$, contradicting the fact that
L is +-prefix-closed. Hence there are no edges ending at $q_0$.

Suppose that L is accepted by an automaton $\mathcal{A}$ in which every state except
the initial state is an accept state, and in which the initial state has no incoming
edges. Let $w \in L$ and let w' be some non-empty prefix of w. Then w labels a path starting
at the initial state of $\mathcal{A}$ and leading to an accept state. The prefix w' labels
an initial segment of this path, ending at a state q, which cannot be the initial
state, since it has no incoming edges, and must therefore, by hypothesis, be an
accept state. Thus $w' \in L$. Since $w \in L$ was arbitrary, L is +-prefix-closed. [3.4]



4 MARKOV MONOIDS

As remarked in § 3, it is conceivable that the class of monoids 
that are Markov as monoids and the class of monoids that are Markov as 
semigroups are distinct, and the same issue arises for being robustly Markov 
and strongly Markov. Fortunately for monoids the monoid and semigroup 
notions are equivalent, as the following three results show: 

Proposition 4.1. A monoid is Markov as a semigroup if and only if it is Markov as 
a monoid. 

Proof of 4.1. Let M be a monoid. 

Suppose that M is Markov as a monoid. Let A be an alphabet representing 
a monoid generating set for M such that there is a monoid Markov language 
L for M over A. Then L is prefix-closed, regular, and contains a unique repre- 
sentative for each element of M. In particular, the identity of M is represented 
by $\varepsilon \in L$. Let 1 be a new symbol representing the identity for M. Then
$K = (L - \{\varepsilon\}) \cup \{1\}$ is +-prefix-closed, regular, and contains a unique represen-
tative for every element of M. Hence K is a semigroup Markov language for
M and thus M is Markov as a semigroup. 

Suppose now that M is Markov as a semigroup. Let A be an alphabet 
representing a semigroup generating set for M such that there is a semigroup 
Markov language L for M over A. Then L is +-prefix-closed, regular, and
contains a unique representative for every element of M. Let w be the unique
word in L representing the identity of M. Let

$$K = (L - wA^*) \cup \{u \in A^* : wu \in L\}.$$

Since L is +-prefix-closed and $wA^*$ is closed under concatenation on the right,
$L - wA^*$ is also +-prefix-closed. Furthermore, $\{u \in A^* : wu \in L\}$ is prefix-
closed. (Notice that this set contains $\varepsilon$ since w lies in L.) So K is prefix-closed.
Moreover, wu and u represent the same element of M for any $u \in A^*$, so
$\{u \in A^* : wu \in L\}$ consists of unique representatives for exactly those elements
of M whose representatives in L have w as a prefix. Hence every element of
M has a unique representative in K. Finally, notice that K is regular. Thus K is
a monoid Markov language for M and so M is Markov as a monoid. [4.1]
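
The second construction in this proof, which strips the representative w of the identity from the front of every word that begins with it, can be tried on a finite example. The Python sketch below assumes the cyclic group of order 4, written additively, with a hypothetical alphabet in which e represents 0 and g represents 1; the language {e, eg, egg, eggg} is then a semigroup Markov language, and w = e. All names are illustrative.

```python
def to_monoid_language(L, w):
    """K = (L - wA*) ∪ {u : wu ∈ L}, computed for a finite language L."""
    outside = {v for v in L if not v.startswith(w)}
    stripped = {v[len(w):] for v in L if v.startswith(w)}
    return outside | stripped

def is_prefix_closed(words):
    return all(v[:i] in words for v in words for i in range(len(v)))

def value(word, gens, n):
    """Element of Z_n represented by `word` (the empty word gives 0)."""
    return sum(gens[c] for c in word) % n

GENS = {"e": 0, "g": 1}            # hypothetical alphabet for Z_4
L = {"e", "eg", "egg", "eggg"}     # +-prefix-closed, one word per element
K = to_monoid_language(L, "e")     # "e" represents the identity
```

In this example K is {ε, g, gg, ggg}, which is prefix-closed and still contains exactly one representative of each element.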

Proposition 4.2. 1. If a monoid is robustly Markov as a monoid with respect to 
some alphabet A representing a semigroup generating set, it is also robustly 
Markov as a semigroup with respect to A. Furthermore, if a monoid is robustly 
Markov as a monoid with respect to an alphabet B representing a monoid generat- 
ing set that is not also a semigroup generating set, then it is robustly Markov as a 
semigroup with respect to B U {1}, where 1 represents the identity. 

2. If a monoid is robustly Markov as a semigroup with respect to some alphabet A 
representing a (semigroup) generating set, then it is robustly Markov as a monoid 
with respect to A. Furthermore, if a monoid is robustly Markov as a semigroup 
with respect to $B \cup \{1\}$, where B represents a monoid generating set and 1 represents
the identity, then it is robustly Markov as a monoid with respect to B. 

Proof of 4.2. Let M be a monoid. 

1. Suppose that M admits a robust monoid Markov language L over A. Since 
A generates M as a semigroup, one can choose a shortest non-empty word 
w over A representing the identity of M. Let $w = w_1 \cdots w_n$, with $w_i \in A$.
For each non-empty prefix $w_1 \cdots w_i$ of w, let $p_i$ be the unique element of
L representing the same element of M as this prefix. Notice that if an
element of L has a prefix representing $\overline{w_1 \cdots w_i}$, that prefix must be $p_i$
by the prefix-closure of L and the fact that it maps bijectively onto M.
Moreover, the length of $p_i$ must be the same as the length of $w_1 \cdots w_i$. To
find a robust semigroup Markov language for M over A, it is necessary
to replace the prefixes $p_i$ by $w_1 \cdots w_i$ and the empty word $\varepsilon$ by w. More
formally, let

$$K = \Bigl((L - \{\varepsilon\}) - \bigcup_{i=1}^{n} p_i A^*\Bigr) \cup \{w\} \cup \bigcup_{i=1}^{n} \{w_1 \cdots w_i u : p_i u \in L\}.$$

Now, $L - \{\varepsilon\}$ is +-prefix-closed. Since each language $p_i A^*$ is closed under
concatenation on the right,

$$(L - \{\varepsilon\}) - \bigcup_{i=1}^{n} p_i A^*$$

is +-prefix-closed. Furthermore,

$$\{w\} \cup \bigcup_{i=1}^{n} \{w_1 \cdots w_i u : p_i u \in L\}$$

is +-prefix-closed since L is and since every prefix of w is in this set. There-
fore K is +-prefix-closed. Furthermore, K is regular and, by definition,
maps bijectively onto M. Finally, since $|p_i| = |w_1 \cdots w_i|$, it follows that the
representative in K of an element of M is the same length as its representa-
tive in L, excepting that the identity is represented by the non-empty word
w in K. So K is a robust semigroup Markov language over A for M.

For the final claim, let L be a robust monoid Markov language for M
over B. Then 1 is a shortest non-empty representative of $1_M$ over the
alphabet $B \cup \{1\}$. Then $K = (L - \{\varepsilon\}) \cup \{1\}$ is regular, +-prefix-closed, and
consists of minimal-length unique representatives for M. So K is a robust
semigroup Markov language for M.

2. Suppose that M admits a robust semigroup Markov language L over an 
alphabet A representing a semigroup generating set for M. 

Let $w \in L$ be the representative of the identity of M. Since L does not
contain the empty word, $|w| \geq 1$. Suppose that some word $u \in L$ contains
w as a proper subword, with $u = u'wu''$. Then $\overline{u'u''} = \overline{u}$ and $|u'u''| < |u|$,
which contradicts the fact that representatives in L are supposed to
be length-minimal. So w is not a proper subword of any word in L. In
particular, $L' = L - \{w\}$ is +-prefix-closed.

Notice that L' is +-prefix-closed, regular, and consists of unique repre-
sentatives having minimal length (over A) for non-identity elements of M.
Thus $K = L' \cup \{\varepsilon\}$ is prefix-closed, regular, and consists of unique represen-
tatives for all elements of M. So K is a robust monoid Markov language
over A for M.

For the final claim, let $A = B \cup \{1\}$ and follow the same reasoning. In
this case, 1 is the minimal-length representative for $1_M$ and does not occur
as a subword of any other element of L. So $L' \subseteq B^+$ and so K is a robust
monoid Markov language over B for M. [4.2]

The following result is a consequence of Proposition 4.2:

Proposition 4.3. A monoid is strongly Markov as a semigroup if and only if it is 
strongly Markov as a monoid. 

Proof of 4.3. Let M be a monoid. 

Suppose M is strongly Markov as a monoid. Let A be an alphabet repre- 
senting a semigroup generating set for M. Then M is robustly Markov as a 
monoid with respect to A. By the first part of Proposition 4.2, M is robustly 
Markov as a semigroup with respect to A. Since A was an arbitrary alphabet 
representing a semigroup generating set for M, by definition M is strongly 
Markov as a semigroup. 

Suppose M is strongly Markov as a semigroup. Let B be an alphabet 
representing a monoid generating set for M. Then M is robustly Markov as 
a semigroup with respect to $B \cup \{1\}$, where $\overline{1} = 1_M$. By the second part of
Proposition 4.2, M is robustly Markov as a monoid with respect to B. Since
B was an arbitrary alphabet representing a monoid generating set for M, by
definition M is strongly Markov as a monoid. [4.3]

In light of Propositions 4.1, 4.2, and 4.3, there is no need for a termi- 
nological distinction between the conditions 'Markov as a semigroup' and 
'Markov as a monoid', between 'robustly Markov as a semigroup' and 'ro- 
bustly Markov as a monoid', and between 'strongly Markov as a semigroup' 
and 'strongly Markov as a monoid': the terms 'Markov', 'robustly Markov', 
and 'strongly Markov' alone will suffice. 

The results in this section parallel the situation for automatic monoids: 
a monoid is automatic as a semigroup if and only if it is automatic as a 
monoid [DRR99, §5]. 



5 BASIC PROPERTIES 

It is important to note that a Markov language does not define a 
group or semigroup up to isomorphism, unlike an automatic structure [KO06, 
Proposition 2.3]. To see this, notice that if A is a finite alphabet of size n, then 
A (qua language of one-letter words) is a semigroup Markov language for any 
semigroup of size n, and $A \cup \{\varepsilon\}$ is a monoid Markov language for any monoid
or group of size n + 1. The language $(a^* \cup (a^{-1})^*)(b^* \cup (b^{-1})^*)(c^* \cup (c^{-1})^*)$
is a Markov language for both $\mathbb{Z}^3$ and the Heisenberg group [Ghy90, § 5.2].

The growth series of a semigroup S with respect to a finite alphabet A rep- 
resenting a generating set for S is 

$$\Sigma(S, A) = \sum_{s \in S} x^{\lambda_A(s)},$$

or equivalently

$$\Sigma(S, A) = \sum_{n=0}^{\infty} \sigma_A(n) x^n,$$

where $\sigma_A(n) = |\{s \in S : \lambda_A(s) = n\}|$. A growth series $\Sigma(S, A)$ is said to be
rational if it is a power series expansion of a rational function.

Theorem 5.1. If a semigroup admits a robust Markov language with respect to a 
particular generating set, then its growth series with respect to that generating set 
is a rational function. A strongly Markov semigroup has rational growth series with
respect to any generating set.

Proof of 5.1. The proof for groups generalizes directly [GdlH90a, Corollaire 14]. [5.1]
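
When a robust Markov language is given by a deterministic automaton in which every state is an accept state, the coefficients of the growth series count the paths of each length from the initial state. The Python sketch below illustrates this for the free commutative monoid of rank 2 over {a, b}, assuming the language a*b* as its robust Markov language; this example and the function name are illustrative and not taken from the text.

```python
INITIAL = 0
# Automaton for a*b*; both states are accept states.
DELTA = {(0, "a"): 0, (0, "b"): 1, (1, "b"): 1}

def growth_coefficients(delta, initial, n_max):
    """Number of accepted words of each length 0..n_max.

    `counts[q]` is the number of words of the current length whose path
    ends at state q; every state is assumed to be an accept state.
    """
    counts = {initial: 1}
    coefficients = []
    for _ in range(n_max + 1):
        coefficients.append(sum(counts.values()))
        following = {}
        for (state, _letter), target in delta.items():
            if state in counts:
                following[target] = following.get(target, 0) + counts[state]
        counts = following
    return coefficients

coefficients = growth_coefficients(DELTA, INITIAL, 5)
```

The coefficients obtained are 1, 2, 3, 4, 5, 6, matching the expansion of $1/(1-x)^2$; they satisfy the linear recurrence $c_n - 2c_{n-1} + c_{n-2} = 0$ for $n \geq 2$, which is the kind of recurrence that characterizes a rational series.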

The independent importance of semigroup growth series (see, for example, 
[GdlH97, § 4]) means that, as a consequence of Theorem 5.1, robust Markov 
semigroups are of considerably greater interest than Markov semigroups gen- 
erally. 

Remark 5.2. It is worth observing that the growth rate of a Markov language 
need not mirror the growth of the semigroup or monoid. For example, all 
finitely generated polycyclic groups are Markov [GdlH90a, Corollaire 11]. Fur-
thermore, the language of collected words for a finitely generated polycyclic
group forms a Markov language [Sim94, p. 395] and is easily seen to have
polynomial growth. However, a polycyclic group that is not virtually nilpo-
tent contains a free subsemigroup of rank 2 [Ros74, Theorem 4.12] and hence
has exponential growth.

Being Markov implies the existence of a regular language of unique normal 
forms over any finite generating set: 

Proposition 5.3. Let S be a semigroup that admits a regular language of unique 
normal forms over some generating set (such as a Markov semigroup), and let A be a
finite alphabet representing a generating set for S. Then there is a regular language L
over A such that every element of S has a unique representative in L.

[Notice that even if S is a Markov semigroup, the language L need not be 
prefix-closed.] 

Proof of 5.3. Let K be a regular language of unique normal forms for S over
some finite alphabet B. For each $b \in B$, let $u_b \in A^+$ be such that $u_b$ represents
$\overline{b}$. Let $R \subseteq B^+ \times A^+$ be the rational relation

$$R = \{(b_1, u_{b_1})(b_2, u_{b_2}) \cdots (b_n, u_{b_n}) : b_i \in B, n \in \mathbb{N}\}.$$

Notice that if $(v, w) \in R$, then $\overline{v} = \overline{w}$.
Let

$$L = K \circ R = \{w \in A^* : (\exists v \in K)((v, w) \in R)\};$$

observe that L is a regular language. Notice that, by the definition of R, for
each word v in K there is exactly one word $w \in L$ with $(v, w) \in R$. Since for
each $x \in S$ there is exactly one word v in K with $\overline{v} = x$, it follows that there is
exactly one word $w \in L$ with $\overline{w} = x$. That is, the language L maps bijectively
onto S. [5.3]



6 A NON-MARKOV MONOID WITH A REGULAR SET OF UNIQUE 
REPRESENTATIVES 

This section exhibits a non-Markov monoid that nevertheless ad-
mits a regular language of unique representatives over any alphabet repre-
senting a finite generating set. (That is, regularity and uniqueness of repre-
sentatives is achievable over any alphabet representing a generating set, but
prefix-closure is never achievable.) This is important because it shows that
the classes of Markov semigroups and monoids are properly contained in the
classes of semigroups and monoids admitting regular languages of unique
normal forms: the requirement of prefix-closure properly restricts the classes
under consideration.

Figure 1: An outline of the graph of the action of X on T.

The example depends on the following construction from [MR, § 5]. 

Definition 6.1. For any action of a semigroup S on a set T, define a new 
semigroup S[T] as follows. The carrier set is $S \cup T$; multiplication in S remains
the same, and for $s \in S$ and $x, y \in T$,

$$sx = x, \qquad xs = x \cdot s, \qquad xy = y.$$

It is straightforward to check that this multiplication is associative.
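
Associativity can also be confirmed by brute force on a small instance. The Python sketch below assumes a hypothetical example in which S is the cyclic group of order 3 (written additively) acting on a three-element set T by shifting; elements of the carrier set are tagged to keep S and T disjoint. The function name and the example are illustrative only.

```python
from itertools import product

def extend(s_mul, act):
    """Multiplication on S ∪ T; elements are tagged ("S", s) or ("T", t)."""
    def mul(u, v):
        if v[0] == "T":                       # sx = x and xy = y
            return v
        if u[0] == "S":                       # product inside S
            return ("S", s_mul(u[1], v[1]))
        return ("T", act(u[1], v[1]))         # xs = x · s
    return mul

# Hypothetical instance: S = Z_3 under addition, T = {0, 1, 2}, x · s = x + s mod 3.
mul = extend(lambda s, t: (s + t) % 3, lambda x, s: (x + s) % 3)
elements = [("S", i) for i in range(3)] + [("T", i) for i in range(3)]
associative = all(mul(mul(a, b), c) == mul(a, mul(b, c))
                  for a, b, c in product(elements, repeat=3))
```

The check relies on the map `act` being a genuine action, that is, on $(x \cdot s) \cdot s' = x \cdot (ss')$ holding for all x in T and s, s' in S.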

To construct the example, proceed as follows. Let F and F' be free monoids 
with bases $X = \{x, y\}$ and $X' = \{x', y'\}$ respectively and let

$$R = \{w \in F' : |w|_{y'} \text{ is even}\}.$$

Let $w_0, w_1, w_2, \ldots$ be the elements of R enumerated in length-plus-lexicographic
order. Define $\psi : \mathbb{N} \cup \{0\} \to R$ by $j \mapsto w_j$, so that $\psi$ is a bijection between $\mathbb{N} \cup \{0\}$
and R. Notice that $|j\psi| < 2^j$ for all $j \in \mathbb{N} \cup \{0\}$. Let

$$P = \{p_i : i \in \mathbb{N}\},$$
$$Q = \{q_i : i \in \mathbb{N} \land \neg(\exists j \in \mathbb{N} \cup \{0\})(i = 2^j)\},$$
$$T = P \cup Q \cup F' \cup \{\Omega\}.$$

Define an action of the generators x and y on the set T as follows: 



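$$p_i \cdot x = p_{i+1}, \qquad p_i \cdot y = \begin{cases} q_i & \text{if } i \neq 2^j \text{ for any } j \in \mathbb{N} \cup \{0\}, \\ j\psi & \text{if } i = 2^j, \end{cases}$$

$$q_i \cdot x = \Omega, \qquad w \cdot x = wx' \ (\text{for } w \in F'), \qquad \Omega \cdot x = \Omega,$$

$$q_i \cdot y = \Omega, \qquad w \cdot y = wy' \ (\text{for } w \in F'), \qquad \Omega \cdot y = \Omega.$$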

Figure 1 illustrates the graph of the action of X on T. Since F is free on X, this 
action extends to a unique action of F on T. 
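
The action can be simulated directly. The Python sketch below rests on a reading of the definition that should be treated as an assumption: x sends $p_i$ to $p_{i+1}$, while y sends $p_i$ to $j\psi$ when $i = 2^j$ and to $q_i$ otherwise; this reading agrees with the language $ea^*b$ used in Proposition 6.2. The encoding is hypothetical: x' and y' are written X and Y, the lexicographic order is taken with X before Y, and elements of T are tagged tuples.

```python
from itertools import count, product

OMEGA = "Ω"

def psi(j):
    """j-th word (from 0) with an even number of Y, in length-plus-lex order."""
    seen = 0
    for length in count(0):
        for letters in product("XY", repeat=length):
            if letters.count("Y") % 2 == 0:
                if seen == j:
                    return "".join(letters)
                seen += 1

def act(t, letter):
    """Action of the generator 'x' or 'y' on an element t of T."""
    if t == OMEGA:
        return OMEGA
    kind, value = t
    if kind == "w":                        # w·x = wx', w·y = wy'
        return ("w", value + letter.upper())
    if kind == "q":                        # q_i·x = q_i·y = Ω
        return OMEGA
    if letter == "x":                      # p_i·x = p_{i+1} (assumed reading)
        return ("p", value + 1)
    if value & (value - 1) == 0:           # i = 2^j, so p_i·y = jψ
        return ("w", psi(value.bit_length() - 1))
    return ("q", value)                    # otherwise p_i·y = q_i

def act_word(t, word):
    """Extend the action from generators to words over {x, y}."""
    for letter in word:
        t = act(t, letter)
    return t
```

For example, `act_word(("p", 1), "xxxy")` passes through $p_4$ and lands on $w_2$, encoded as `("w", "XX")`, whereas `act_word(("p", 1), "xxy")` gives `("q", 3)`.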

The aim is to show that F[T] is not Markov but nevertheless admits a regu-
lar language of unique representatives over any finite alphabet representing a 
generating set. 

Notice that in F[T], elements of F multiply as in the free monoid and act on 
T. Elements of F' are members of the set T and thus multiply like right zeroes. 

Proposition 6.2. The monoid F[T] admits a regular language of unique representa- 
tives over any finite alphabet representing a generating set. 

Proof of 6.2. By Proposition 5.3, it suffices to prove that F[T] admits a regular 
language of unique representatives over some particular finite alphabet repre- 
senting a generating set. 

Let $A = \{a, b, c, d, e, f\}$, where $\overline{a} = x$, $\overline{b} = y$, $\overline{c} = x'$, $\overline{d} = y'$, $\overline{e} = p_1$, and
$\overline{f} = \Omega$. Let $\rho : F' \to A^+$ be the bijection extending $x' \mapsto c$ and $y' \mapsto d$. Let

$$L = \{a, b\}^* \cup ea^* \cup ea^*b \cup (\{c, d\}^+ - R\rho) \cup \{f\}.$$

Then L maps bijectively onto F[T]. In particular, the subset $\{a, b\}^*$ maps bijec-
tively onto F, the subset $ea^*$ maps bijectively onto $\{p_i : i \in \mathbb{N}\}$, the subset $ea^*b$
maps bijectively onto $\{q_i : i \in \mathbb{N}\} \cup R$, and the subset $\{c, d\}^+ - R\rho$ maps bijec-
tively onto $F' - R$. So $L \subseteq A^*$ is a regular language of unique representatives
for F[T]. [Note that L is not prefix-closed, since it does not contain words from
$R\rho$ but does contain all words in $(Ry')\rho = (R\rho)d$.] [6.2]

Proposition 6.3. The monoid F[T] is not Markov. 

Proof of 6.3. Suppose, with the aim of obtaining a contradiction, that F[T] ad- 
mits a Markov language L over some alphabet A. 

Informally, the strategy is to reach a contradiction by proving the follow- 
ing: 

1. Sufficiently long elements of $R$ must have representatives in $L$ that label
paths that run through $P$ for most of their length (excepting a short prefix)
and enter $R \subseteq F'$ on their last letter. (Lemma 6.5.)

2. Sufficiently long elements of $F' - R$ have representatives in $L$ that label
paths that run through $F'$ for most of their length (excepting a short prefix).
(Lemma 6.6.)

3. Taking a suitable prefix of a representative of an element of $F' - R$ yields a
representative of an element of $R$ that is not of the form described in step 1.
(Conclusion of proof.)

As a preliminary, define several subalphabets of $A$ and several constants
that will be used later to clarify what 'sufficiently long' means in the plan
above. Let

\[ A_P = \{a \in A : \overline{a} \in P\}, \]
\[ A_Q = \{a \in A : \overline{a} \in Q\}, \]
\[ A_{F'} = \{a \in A : \overline{a} \in F'\}, \]
\[ A_F = \{a \in A : \overline{a} \in F\}, \]
\[ A_x = \{a \in A : \overline{a} \in x^+\}, \]
\[ A_\Omega = \{a \in A : \overline{a} = \Omega\}; \]

notice that $A$ is the disjoint union of $A_P$, $A_Q$, $A_{F'}$, $A_F$ and $A_\Omega$, and that
$A_x \subseteq A_F$. Let

\[ m_1 = |u|, \text{ where } u \text{ is the unique representative in } L \text{ of } \Omega, \]
\[ m_2 = \max\{i : p_i \in \overline{A_P}\}, \]
\[ m_3 = \max\{i : q_i \in \overline{A_Q}\}, \]
\[ m_4 = \max\{|\overline{a}| : a \in A_{F'}\}, \]
\[ m = \max\{m_1, m_2, m_3, m_4\}. \]

Let $k = \max\{|\overline{a}| : a \in A_F\}$.

Let $\mathcal{A}$ be a deterministic finite automaton recognizing $L$. Consider the set
of labels on simple loops in $\mathcal{A}$. Let $V$ be the set of such labels that lie in $A_x^*$.
Let $n$ be a constant that is a multiple of all of the lengths of the elements of $V$
and that also exceeds the number of states in $\mathcal{A}$.

Lemma 6.4. Let $uav \in L$, where $a \in A - A_F$. Then $|u| < n$. That is, any letter
from $A_P \cup A_Q \cup A_{F'} \cup A_\Omega$ in a word in $L$ must lie in the first $n$ letters, and hence
$L \subseteq A^{\le n} A_F^*$.

Proof of 6.4. Suppose for reductio ad absurdum that $uav \in L$ is as in the hypothesis
but that $|u| \ge n$. Then by the pumping lemma, $u$ factorizes as $u'u''u'''$
such that $u'(u'')^\alpha u'''av \in L$ for all $\alpha \in \mathbb{N} \cup \{0\}$. Since $\overline{a} \in T$, it follows from
the definition of multiplication in $F[T]$ that

\[ \overline{u'(u'')^\alpha u'''av} = \overline{av}, \]

for every $\alpha \in \mathbb{N} \cup \{0\}$, which contradicts the uniqueness of representatives in
$L$. Hence $|u| < n$. [6.4]

Lemma 6.5. The representative in $L$ of every $w \in R \subseteq F'$ with $|w| > m + n + k + kn$
has the form $vc$, where $v \in A^*$, $c \in A_F - A_x$, $\overline{v} \in P$ and $\overline{c} = x^\beta y$ for some $\beta < k$.

Proof of 6.5. Let $j$ be such that $j\psi = w$. Since $|w| > m + n + k + kn$, it follows that
$2^j > |w| > m + n$ and hence $2^j - n > m$. It also follows that $2^j > n + kn \ge 2n$,
and so $n < 2^{j-1}$. Hence $2^j - n > 2^{j-1}$. Thus $2^j - n$ is not a power of 2 and so
there is an element $q_{2^j - n} \in Q$.

Let $t$ be the representative in $L$ of $q_{2^j - n}$. Since $2^j - n > m$, the rightmost
letter $a$ from $A - A_F$ in the word $t$ cannot be such that $\overline{a} = q_{2^j - n}$, by the
definition of $m$; therefore $a$ must lie in $A_P$. By Lemma 6.4, $t$ factorizes as $uas$,
where $|u| < n$ and $s \in A_F^*$. Let $s = s'cs''$, where $s' \in A_x^*$ and $c \in A_F - A_x$.
(Such a letter $c$ must exist, otherwise $\overline{uas} \in P$.) Now, $\overline{uas'c} \in Q$. Since the
action of $F$ on any element of $Q$ leads to the sink element $\Omega$, it follows that $s''$
is the empty word. Hence $t = uas'c$.

Let $\overline{c} = x^\beta y z$, where $z \in \{x, y\}^*$. Then $\overline{uas'}x^\beta y \in Q$, and so $z = \varepsilon$ since
otherwise $\overline{uas'}x^\beta y z = \Omega$. Since $|\overline{c}| \le k$, it follows a fortiori that $\beta < k$.

Furthermore, since $\overline{uas'c} = q_{2^j - n}$, it follows that $\overline{uas'} = p_{2^j - n - \beta}$. Hence,
since $\overline{ua} = \overline{a} = p_{m'}$ for some $m' \le m$, it follows that

\[ |\overline{s'}| = 2^j - n - \beta - m' > 2^j - n - k - m > kn. \]

Thus $|s'| > n$ since each letter of $s'$ represents an element of $F$ whose length is
at most $k$.

Thus by the pumping lemma $s'$ factorizes as $v'v''v'''$, where $|\overline{v''}|$ divides $n$
(by the definition of $n$) and $uav'(v'')^\alpha v'''c \in L$ for all $\alpha \in \mathbb{N} \cup \{0\}$. Set $\alpha = n/|\overline{v''}| + 1$.
Then $\overline{uav'(v'')^\alpha v'''} = p_{2^j - \beta}$. Thus

\[ \overline{uav'(v'')^\alpha v'''c} = p_{2^j - \beta}\, x^\beta y = p_{2^j}\, y = j\psi = w. \]

Set $v = uav'(v'')^\alpha v'''$ to see that the representative of $w$ has the form $vc$. [6.5]

Lemma 6.6. Let $w \in F' - R$. Then the representative in $L$ of $w$ factorizes as $uv$ where
$\overline{u} \in F'$ with $|\overline{u}| < m + k + kn$ and $v \in A_F^*$.

Proof of 6.6. Let $w$ be as in the hypothesis and let $t$ be its representative in $L$.
Since $t$ cannot lie in $A_F^*$, it contains some letter from $A_P \cup A_Q \cup A_{F'} \cup A_\Omega$. The
rightmost such letter cannot lie in $A_Q \cup A_\Omega$, since this would force $\overline{t}$ to lie in
$Q \cup \{\Omega\}$. So the rightmost such letter is either from $A_{F'}$ or $A_P$.

If the rightmost such letter is from $A_{F'}$, then by Lemma 6.4, $t = u'av$,
where $a \in A_{F'}$, $|u'| < n$, $v \in A_F^*$. Set $u = u'a$. Then $\overline{u} = \overline{a}$ and so $|\overline{u}| \le m <
m + nk$ and there is nothing more to prove.

So suppose the rightmost such letter is from $A_P$. Then by Lemma 6.4,
$t = t'bt''$, where $b \in A_P$, $|t'| < n$, $t'' \in A_F^*$. Then $w = \overline{t} = \overline{bt''}$. Now, if
$t'' \in A_x^*$, then $\overline{bt''} \in P$ by the definition of the action. So $t''$ contains some
letter from $A_F - A_x$. Let $t'' = scv$, where this distinguished letter $c$ is the
leftmost letter of $t''$ that is from $A_F - A_x$, so that $s \in A_x^*$. Then $\overline{bs} \in P$ and
$\overline{bsc} \in F'$, since the alternative $\overline{bsc} \in Q \cup \{\Omega\}$ cannot happen because this set is
closed under the action of $\overline{v}$.

Thus far $t$ has been factorized as $t'bscv$. The next step is to show that $|s| <
n$. Suppose for reductio ad absurdum that $|s| \ge n$. Then $s$ factorizes as $s's''s'''$,
where $t'bs'(s'')^\alpha s'''cv \in L$ for all $\alpha \in \mathbb{N} \cup \{0\}$. Now, since $s = s's''s''' \in A_x^*$,
the elements $\overline{t'bs'(s'')^\alpha s'''}$ are a sequence of elements $p_{i_\alpha}$ whose indices $i_\alpha$
form a linear progression. But the indices of the elements $p_i \in P$ such that
$p_i \cdot \overline{c} \in F'$ are the terms of an exponential function. So there are infinitely
many $\alpha \in \mathbb{N} \cup \{0\}$ such that $\overline{t'bs'(s'')^\alpha s'''c}$ lies in $Q \cup \{\Omega\}$.

Reasoning as in the third paragraph of the proof of Lemma 6.5, $\overline{c} = x^\beta y$.
Now, if $v \neq \varepsilon$, then $\overline{t'bs'(s'')^\alpha s'''cv} = \Omega$ for infinitely many $\alpha \in \mathbb{N} \cup \{0\}$, which
contradicts uniqueness of representatives. If, on the other hand, $v = \varepsilon$, then
$w = \overline{t} = \overline{t'bsc} = j\psi$ for some $j$ since $\overline{t'bs} \in P$, $\overline{c} = x^\beta y$, and $\overline{t'bsc} \in F'$. So
$w = j\psi \in R$, which contradicts the hypothesis of the lemma. Hence $|s| < n$.

Therefore $|\overline{s}| < kn$ since each letter of $s$ represents a word in $X^*$ of length
at most $k$.

Now, $\overline{b} = p_{m'}$, where $m' \le m$ by the definition of $m$. Hence $\overline{bs} = p_{m'}\overline{s} =
p_h$ for some $h < m + kn$ by the definition of the action of $x$ on the $p_i$. Suppose
$\overline{c} = x^\beta y z$ for some $\beta \in \mathbb{N} \cup \{0\}$ and $z \in \{x, y\}^*$. Then $\beta + |z| < k$. Since $\overline{t'bsc} \in F'$, it follows
that $h + \beta = 2^j$ for some $j \in \mathbb{N} \cup \{0\}$. Hence $\overline{t'bsc} = wz$, where $w \in R$ with
$|w| < 2^j$. Now,

\[ |\overline{t'bsc}| = |wz| = |w| + |z| \le |j\psi| + |z| < 2^j + |z| = h + \beta + |z| < h + k < m + k + kn. \]

Let $u = t'bsc$. Then $t = uv$ with $|\overline{u}| < m + k + kn$. [6.6]

Choose $w \in R$ with $|w| > m + n + k + kn$. Then $|w|_{y'}$ is even and so
$|w(x')^{2k}y'|_{y'}$ is odd, so that $w(x')^{2k}y' \notin R$. Let $t$ be the representative in $L$ of
$w(x')^{2k}y'$. Then by Lemma 6.6, $t$ factorizes as $uv$, where $v$ is the longest
suffix lying in $A_F^*$ and $\overline{u} \in F'$ with $|\overline{u}| < m + k + kn$.

In particular, $|w(x')^{2k}y'| > m + 3k + kn$. Since $|\overline{u}| < m + k + kn$, it follows
that $|\overline{v}| > 2k$. Since each letter of $v$ represents an element of $F$ of length at most
$k$, the word $v$ has length at least 2. So let $v = v'ab$, where $a, b \in A_F$. Since
$|\overline{a}|, |\overline{b}| \le k$, $\overline{t} = \overline{uv} = w(x')^{2k}y'$ and $\overline{u} \in F'$, it follows from the action of $F$ on
$F' \subseteq T$ that $\overline{uv'} = w(x')^\alpha$ for some $\alpha \in \{1, \ldots, 2k\}$, $\overline{a} = x^\beta$ (so that $a \in A_x$),
and $b \in A_F - A_x$.

Let $t' = uv'a$. Then $\overline{t'} = w(x')^{\alpha+\beta}$. By prefix-closure, $t' \in L$. Observe that
$t'$ ends with $a \in A_x$.

Now, the word $w(x')^{\alpha+\beta}$ lies in $R$ since $|w|_{y'} = |w(x')^{\alpha+\beta}|_{y'}$ is even. So by
Lemma 6.5, its unique representative $t'$ must factorize as $sc$, where $\overline{c} = x^\beta y$,
so that $c \in A_F - A_x$. This contradicts the fact that $t'$ ends with a letter from
$A_x$.

Thus $F[T]$ does not admit a Markov language. [6.3]



7 REWRITING SYSTEMS 

Confluent noetherian rewriting systems form a natural source of 
examples of Markov semigroups. The following result is easily noticed, but 
will prove very useful: 

Proposition 7.1. Let $(A, \mathcal{R})$ be a confluent noetherian rewriting system with the set
of left-hand sides of rewriting rules in $\mathcal{R}$ being regular. Then the monoid presented
by $\langle A \mid \mathcal{R} \rangle$ is Markov, and its language of normal forms is a Markov language. Furthermore,
if $(A, \mathcal{R})$ is non-length-increasing, then the language of normal forms is a
robust Markov language for the monoid.

Proof of 7.1. The language $L = A^* - A^*\{\ell : (\ell, r) \in \mathcal{R}\}A^*$, which is the language of
normal forms of $(A, \mathcal{R})$, is regular, prefix-closed, and maps bijectively onto the
monoid presented by $\langle A \mid \mathcal{R} \rangle$. For the final observation, notice that if $(A, \mathcal{R})$ is
non-length-increasing, then the language of normal forms consists of minimal-length
representatives. [7.1]

It is worth emphasizing that Proposition 7.1 says that being Markov is a 
necessary condition for a semigroup to be presented by a confluent noetherian 
rewriting system, although it is probably not as useful as other necessary 
conditions such as finite derivation type [SOK94], which are independent of 
the choice of generating set. 

However, the following example shows that a semigroup presented by a fi- 
nite confluent noetherian non-length-increasing rewriting system can admit a 
robust Markov language that looks very different from its language of normal 
forms: 

Example 7.2. Let $A = \{a, b\}$ and $\mathcal{R} = \{(a^2, ba), (b^2, ab)\}$. Then $(A, \mathcal{R})$ is confluent
and noetherian. Let $L$ be its language of normal forms; this is a robust
Markov language by Proposition 7.1. Then $L$ is the language of words over $A$
that contain neither two consecutive letters $a$ nor two consecutive letters $b$;
thus $L$ is the language of alternating products of letters $a$ and $b$:

\[ L = (A^* - A^*aaA^*) - A^*bbA^* = (ab)^* \cup (ab)^*a \cup (ba)^* \cup (ba)^*b. \]

Let $M$ be the monoid presented by $\langle A \mid \mathcal{R} \rangle$. Let

\[ K = ab^* \cup ba^*. \]

The aim is to show that $K$ is also a Markov language for $M$. Notice first that
$K$ is prefix-closed and regular and so it remains to show that it consists of
unique minimal-length representatives for $M$.
Notice that for any $\alpha \in \mathbb{N}$,

\[ (ab)^\alpha = ab(ab)^{\alpha-1} = ab(b^2)^{\alpha-1} = ab^{2\alpha-1} \]

and

\[ (ab)^\alpha a = ab(ab)^{\alpha-1}a = bb(ab)^{\alpha-1}a = b(ba)^\alpha = b(a^2)^\alpha = ba^{2\alpha}. \]

Parallel reasoning shows that $(ba)^\alpha = ba^{2\alpha-1}$ and $(ba)^\alpha b = ab^{2\alpha}$. (The words
$a$ and $b$ themselves lie in both $L$ and $K$.) Thus
every word in $L$ represents the same element as exactly one element of $K$ and
vice versa. Furthermore, the lengths of the corresponding words in $L$ and $K$
are the same. Hence, since $L$ is a robust Markov language for $M$ by Proposition
7.1, $K$ is also a robust Markov language for $M$.
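The correspondence between $L$ and $K$ in Example 7.2 can be confirmed experimentally for short words. The sketch below is only an illustration (the function names are choices made here): it rewrites with the two rules until a normal form is reached and compares the normal forms of the two words of $K$ of each length with the two alternating words of that length.

```python
RULES = [("aa", "ba"), ("bb", "ab")]

def normal_form(word):
    """Apply the rules of Example 7.2 until no left-hand side occurs in the word."""
    while True:
        for lhs, rhs in RULES:
            pos = word.find(lhs)
            if pos != -1:
                word = word[:pos] + rhs + word[pos + len(lhs):]
                break
        else:
            return word   # no rule applies: the word is alternating

def alternating(n):
    """The two alternating words of length n >= 1, i.e. the words of L of length n."""
    return {"".join("ab"[(i + s) % 2] for i in range(n)) for s in (0, 1)}

def K_words(n):
    """The two words of K = ab* U ba* of length n >= 1."""
    return ["a" + "b" * (n - 1), "b" + "a" * (n - 1)]
```

For example, `normal_form("abbb")` returns `"abab"`, matching $(ab)^2 = ab^3$. Since the rules preserve length, comparing `{normal_form(w) for w in K_words(n)}` with `alternating(n)` checks both the bijection and the equality of lengths for that $n$.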

Question 7.3. Is every Markov semigroup presented by a confluent noethe- 
rian rewriting system where the language of left-hand sides of rewriting rules 
is regular? (That is, where the language of all left-hand sides is regular: Ex- 
ample 11.9 below shows that the language of left-hand sides of rules with a 
particular right-hand side may be irregular.) 



8 MARKOV, ROBUSTLY MARKOV, AND STRONGLY MARKOV SEMIGROUPS 

The example in § 6 consists of a non-Markov monoid that admit- 
ted a regular language of unique representatives over any alphabet represent- 
ing a generating set. The present section gives an example of a monoid that is 
Markov but not robustly Markov (Example 8.1) and an example of a monoid 
that is robustly Markov but not strongly Markov (Example 8.4). These three 
examples together show that the classes of Markov, robustly Markov, and 
strongly Markov semigroups are distinct. 

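Example 8.1. Let

\[ P = \{p_i : i \in \mathbb{N}\}, \]
\[ Q = \{q_i : i \in \mathbb{N} \land \neg(\exists j \in \mathbb{N})(i = 2^j)\}, \]
\[ R = \{r_i : i \in \mathbb{N}\}, \]
\[ S = \{s_i : i \in \mathbb{N}\}, \]
\[ T = P \cup Q \cup R \cup S \cup \{\Omega\}. \]

Let $F$ be a free monoid with basis $X = \{x, y\}$. Define an action of $X$ on $T$ as
follows:

\[ p_i \cdot x = p_{i+1}, \qquad p_i \cdot y = \begin{cases} q_i & \text{if } i \neq 2^j \text{ for any } j \in \mathbb{N} \cup \{0\}, \\ s_j & \text{if } i = 2^j \text{ for some } j \in \mathbb{N} \cup \{0\}, \end{cases} \]
\[ q_i \cdot x = r_i, \qquad q_i \cdot y = \Omega, \]
\[ r_i \cdot x = r_{i+1}, \qquad r_i \cdot y = s_i, \]
\[ s_i \cdot x = \Omega, \qquad s_i \cdot y = \Omega, \]
\[ \Omega \cdot x = \Omega, \qquad \Omega \cdot y = \Omega. \]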



Since $F$ is free on $X$, this action extends to a unique action of $F$ on $T$. Figure 2
shows the graph of the action of $X$ on $T$. Propositions 8.2 and 8.3 below show
that $F[T]$ is Markov but not robustly Markov.

Figure 2: Part of the graph of the action of $X$ on $T$. Edges which lead to $\Omega$ are
not shown.



Proposition 8.2. The monoid F[T] is Markov. 

Proof of 8.2. Let $A = \{a, b, c, d, e\}$ be an alphabet representing elements of $F[T]$
as follows:

\[ \overline{a} = x, \quad \overline{b} = y, \quad \overline{c} = p_1, \quad \overline{d} = r_1, \quad \overline{e} = \Omega. \]

Let $K = \{a, b\}^* \cup ca^* \cup ca^*b \cup da^* \cup \{e\}$. Then $K$ is prefix-closed, regular, and
maps bijectively onto $F[T]$. In particular, the subset $\{a, b\}^*$ maps bijectively
onto $F$, the subset $ca^*$ maps bijectively onto $P$, the subset $ca^*b$ maps bijectively
onto $Q \cup S$, and the subset $da^*$ maps bijectively onto $R$. Thus $K$ is a Markov
language for $F[T]$. [8.2]

Proposition 8.3. The monoid F[T] is not robustly Markov. 

Proof of 8.3. Suppose, with the aim of obtaining a contradiction, that F[T] ad- 
mits a robust Markov language L over some alphabet A. 
Define the following subalphabets of $A$:

\[ A_P = \{a \in A : \overline{a} \in P\}, \]
\[ A_Q = \{a \in A : \overline{a} \in Q\}, \]
\[ A_R = \{a \in A : \overline{a} \in R\}, \]
\[ A_S = \{a \in A : \overline{a} \in S\}, \]
\[ A_F = \{a \in A : \overline{a} \in F\}, \]
\[ A_x = \{a \in A : \overline{a} \in x^+\}, \]
\[ A_\Omega = \{a \in A : \overline{a} = \Omega\}; \]

notice that $A$ is the disjoint union of $A_P$, $A_Q$, $A_R$, $A_S$, $A_F$, and $A_\Omega$. Let

\[ m_1 = \max\{i : p_i \in \overline{A_P}\}, \]
\[ m_2 = \max\{i : q_i \in \overline{A_Q}\}, \]
\[ m_3 = \max\{i : r_i \in \overline{A_R}\}, \]
\[ m_4 = \max\{i : s_i \in \overline{A_S}\}, \]
\[ m = \max\{m_1, m_2, m_3, m_4\}. \]

Let $k = \max\{|\overline{a}| : a \in A_F\}$.

Reasoning as in the proof of Lemma 6.5, one sees that for $i$ sufficiently
large, $s_i$ is represented by a word of the form $vc$, where $v \in A^*$, $c \in A_F - A_x$,
$\overline{v} \in P$, and $\overline{c} = x^\beta y$ for some $\beta < k$.

Let $v = v'bv''$, where $v'' \in A_F^*$. Then $b \in A_P$ and so $\overline{v'b} = \overline{b} = p_{m'}$ for some
$m' \le m$. Now, $s_i = \overline{vc} = \overline{v'bv''c} = p_{m'}\overline{v''}x^\beta y$, and so by the definition of the
action, $p_{2^i} = p_{m'}\overline{v''}x^\beta$. Thus $\overline{v''} = x^{2^i - m' - \beta}$. So each letter of $v''$ lies in $A_x$.
Furthermore, since each such letter represents an element of length at most $k$,
it follows that $|v''| > (2^i - m - \beta)/k$ and further that $|v| > (2^i - m - \beta)/k + 2$.

Since $v'' \in A_x^*$, the subalphabet $A_x$ must be non-empty. Let $a \in A_x$, with
$\overline{a} = x^\gamma$. Since $(T - Q) \cdot F$ does not contain any element of $Q$, the subalphabet
$A_Q$ is non-empty and contains some letter $b$ with $\overline{b} = q_\alpha$.

Then $\overline{ba^hc} = q_\alpha x^{\gamma h + \beta} y = s_{\alpha + \gamma h + \beta}$. By choosing $h$ large
enough, $s_{\alpha + \gamma h + \beta}$ is represented in $L$ by a word $v$ of length greater than
$(2^{\alpha + \gamma h + \beta} - m - \beta)/k + 2$. Again choosing $h$ large enough, so that

\[ (2^{\alpha + \gamma h + \beta} - m - \beta)/k + 2 > h + 2, \]

one obtains $|v| > |ba^hc|$. Thus $v$ is not a minimal-length representative of
$s_{\alpha + \gamma h + \beta}$, which contradicts $L$ being a robust Markov language for $F[T]$. [8.3]

Figure 3: Part of the graph of the action of $X$ on $T$. Edges corresponding to
actions which fix elements of $T$ are not shown.



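Example 8.4. Let

\[ P = \{p_i : i \in \mathbb{N} \cup \{0\}\}, \]
\[ Q = \{q_i : i \in \mathbb{N}\}, \]
\[ R = \{r_i : i \in \mathbb{N}\}, \]
\[ T = P \cup Q \cup R. \]

Let $F$ be a free monoid with basis $X = \{x, y, z\}$. Define an action of $X$ on $T$ as
follows:

\[ p_i \cdot x = p_{i+1}, \qquad p_i \cdot y = q_i, \qquad p_i \cdot z = p_i, \]
\[ q_i \cdot x = \begin{cases} q_i & \text{if } i \neq 2^j \text{ for any } j \in \mathbb{N} \cup \{0\}, \\ r_i & \text{if } i = 2^j \text{ for some } j \in \mathbb{N} \cup \{0\}, \end{cases} \]
\[ q_i \cdot y = q_i, \]
\[ q_i \cdot z = \begin{cases} q_i & \text{if } i = 2^j \text{ for some } j \in \mathbb{N} \cup \{0\}, \\ r_i & \text{if } i \neq 2^j \text{ for any } j \in \mathbb{N} \cup \{0\}, \end{cases} \]
\[ r_i \cdot x = r_i, \qquad r_i \cdot y = r_i, \qquad r_i \cdot z = r_i. \]

(Notice that $q_i$ is fixed by one of $x$ or $z$ and sent to $r_i$ by the other, and that
which letter fixes $q_i$ and which sends it to $r_i$ depends on whether $i$ is a power
of 2.) Since $F$ is free on $X$, this action extends to a unique action of $F$ on $T$.
Figure 3 shows the graph of the action of $X$ on $T$. Propositions 8.5 and 8.6
below show that $F[T]$ is robustly Markov but not strongly Markov.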



Proposition 8.5. The monoid F[T] is robustly Markov. 

Proof of 8.5. Let $A = \{a, b, c, d, e, f\}$ be an alphabet representing elements of
$F[T]$ as follows:

\[ \overline{a} = x, \quad \overline{b} = y, \quad \overline{c} = z, \quad \overline{d} = yx, \quad \overline{e} = yz, \quad \overline{f} = p_0. \]

Let $A' = A - \{f\}$. Then $(A', \{(ba, d), (bc, e)\})$ is a confluent noetherian rewriting
system presenting the subsemigroup $F$ of $F[T]$. Hence its language of normal
forms $K_1 = (A')^* - (A')^*(ba \cup bc)(A')^*$ is a robust Markov language for the
subsemigroup $F$ of $F[T]$ by Proposition 7.1.

Let $K_2 = fa^* \cup fa^+d \cup fa^+e$. Then $K_2$ is +-prefix-closed and regular. The
subset $fa^*$ maps bijectively onto $P$. The subsets $fa^+d$ and $fa^+e$ map bijectively
onto $Q \cup R$, since for each $i \in \mathbb{N}$, exactly one of the following cases holds:

• $\overline{fa^id} = p_0x^iyx = p_iyx = q_ix = r_i$ and $\overline{fa^ie} = p_0x^iyz = p_iyz = q_iz = q_i$
(this holds if $i = 2^j$ for some $j \in \mathbb{N} \cup \{0\}$);

• $\overline{fa^id} = p_0x^iyx = p_iyx = q_ix = q_i$ and $\overline{fa^ie} = p_0x^iyz = p_iyz = q_iz = r_i$
(this holds if $i \neq 2^j$ for any $j \in \mathbb{N} \cup \{0\}$).

Thus $K_2$ maps bijectively onto $T$.

It remains to show that every word in $K_2$ is a minimal-length representative.
Let $u \in A^*$ represent $p_i$. Then $u$ must contain $f$, since all other letters in
$A$ represent elements of $F$. So let $u = u'fu''$, where $u'' \in (A - \{f\})^*$, so that
this distinguished letter $f$ is the rightmost such letter in $u$. Each symbol in
$A - \{f\}$ represents an element of $F$ that contains at most one letter $x$. So, by the
definition of the action on the $p_i$, it follows that $u''$ must contain at least $i$ letters.
Hence $|u| \ge i + 1$. Any word over $A$ representing $q_i$ or $r_i$ must therefore
have length at least $i + 2$. By the observations in the preceding paragraph, the
representative in $K_2$ of $p_i$ has length $i + 1$, and those of $q_i$ and $r_i$ both have
length $i + 2$.

Therefore the language $K_1 \cup K_2$ is prefix-closed, regular, and consists of
minimal-length representatives for $F[T]$. So $K_1 \cup K_2$ is a robust Markov language
for $F[T]$. [8.5]

Proposition 8.6. The monoid F[T] is not strongly Markov. 

Proof of 8.6. Suppose, with the aim of obtaining a contradiction, that $F[T]$ is
strongly Markov. Let $A = \{a, b, c, f\}$ represent elements of $F[T]$ as follows:

\[ \overline{a} = x, \quad \overline{b} = y, \quad \overline{c} = z, \quad \overline{f} = p_0. \]

Since $F[T]$ is strongly Markov, it admits a robust Markov language $L$ over
the alphabet $A$. Let $n$ be greater than the number of states in an automaton
recognizing $L$. Choose $k$ such that $2^k > n$.

It is easy to see that the unique shortest word over $A$ representing $r_{2^k}$ is
$fa^{2^k}ba$. Therefore this word lies in $L$. By the pumping lemma, $a^{2^k}$ factorizes
as $v'v''v'''$, where $v', v'', v''' \in a^*$ and $fv'(v'')^\alpha v'''ba \in L$ for every $\alpha \in \mathbb{N} \cup \{0\}$.
Choose $\alpha$ so that $m = |v'(v'')^\alpha v'''|$ is not a power of 2. Then $\overline{fv'(v'')^\alpha v'''b} =
q_m$, and $\overline{fv'(v'')^\alpha v'''ba} = q_m x = q_m$. Hence $fv'(v'')^\alpha v'''b$ and $fv'(v'')^\alpha v'''ba$
represent the same element of $F[T]$. Since both these words lie in $L$ by prefix-closure,
this contradicts the uniqueness of representatives in $L$. [8.6]
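The collision used in the proof of Proposition 8.6 is easy to observe directly. The following Python sketch simulates the action of Example 8.4 on words over $\{a, b, c\}$ applied to $p_0$; the tuple encoding of $T$ and the function names are assumptions made for illustration.

```python
def is_power_of_two(i):
    return i >= 1 and i & (i - 1) == 0

def act(t, letter):
    """Action of a generator 'x', 'y' or 'z' on an element (kind, i) of T (Example 8.4)."""
    kind, i = t
    if kind == "p":
        return {"x": ("p", i + 1), "y": ("q", i), "z": ("p", i)}[letter]
    if kind == "q":
        if letter == "y":
            return t
        # x sends q_i to r_i exactly when i is a power of 2; z does so otherwise
        moves = is_power_of_two(i) if letter == "x" else not is_power_of_two(i)
        return ("r", i) if moves else t
    return t   # every generator fixes r_i

def evaluate(word):
    """Element of T represented by f followed by word, with a, b, c standing for x, y, z."""
    t = ("p", 0)
    for ch in word:
        t = act(t, {"a": "x", "b": "y", "c": "z"}[ch])
    return t
```

With $m = 6$, the words $fa^6b$ and $fa^6ba$ both evaluate to $q_6$, whereas with $m = 8$ they evaluate to $q_8$ and $r_8$ respectively. Pumping inside $a^{2^k}$ necessarily produces exponents of the first kind.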



9 COMMUTATIVE SEMIGROUPS 

That finitely generated commutative semigroups are Markov could
be deduced from Proposition 7.1, the fact that finitely generated commutative
monoids have presentations via finite confluent noetherian rewriting
systems [Die86], and the closure of the class of Markov semigroups under
adjoining and removing an identity (Proposition 13.1 below). However, a
stronger result holds:

Proposition 9.1. Finitely generated commutative semigroups are strongly Markov. 

[The first part of the following proof parallels the proof that all commutative
cancellative semigroups are automatic; see [Cai05, Theorem 5.4.2].]

Proof of 9.1. Let $A$ be a finite alphabet representing an arbitrary generating set
for some commutative semigroup $S$. Suppose $A = \{a_1, \ldots, a_n\}$. Consider
elements of $S$ using tuples: identify the tuple $(\alpha_1, \ldots, \alpha_n)$ with the element
$a_1^{\alpha_1} \cdots a_n^{\alpha_n}$. Define the ShortLex ordering $\preceq_{\mathrm{SLex}}$ of these tuples by

\[ (\alpha_1, \ldots, \alpha_n) \preceq_{\mathrm{SLex}} (\beta_1, \ldots, \beta_n) \iff \sum_{i=1}^n \alpha_i < \sum_{i=1}^n \beta_i, \text{ or } \Bigl[\sum_{i=1}^n \alpha_i = \sum_{i=1}^n \beta_i \text{ and } (\alpha_1, \ldots, \alpha_n) \sqsubseteq_{\mathrm{Lex}} (\beta_1, \ldots, \beta_n)\Bigr], \]

where $\sqsubseteq_{\mathrm{Lex}}$ is the lexicographical order of tuples: $(\alpha_1, \ldots, \alpha_n) \sqsubseteq_{\mathrm{Lex}} (\beta_1, \ldots, \beta_n)$
if the leftmost non-zero coordinate of $(\beta_1 - \alpha_1, \ldots, \beta_n - \alpha_n)$ is positive.

Rédei's Theorem [Red63] asserts that $S$ is finitely presented. An approach
to this theorem found in [RGS99, Chapter 5] (which is a modification of the
proof in [Gri93]) shows that the semigroup $S$ is isomorphic to

\[ \bigl[(\mathbb{N} \cup \{0\})^n - \{(0, \ldots, 0)\}\bigr] / \{(u_1, v_1), \ldots, (u_n, v_n)\}^{\#}, \]

where $u_i \prec_{\mathrm{SLex}} v_i$, and such that the ShortLex-minimal representative of $w \in
(\mathbb{N} \cup \{0\})^n - \{(0, \ldots, 0)\}$ can be found by repeatedly replacing $w$ by $w - v_i + u_i$
whenever every coordinate of $w - v_i$ is non-negative. (Addition is performed
componentwise on tuples.)

Since the ShortLex order is compatible with the operation (that is, for all
$x \in S$, $u \preceq_{\mathrm{SLex}} v \implies u + x \preceq_{\mathrm{SLex}} v + x$), the set of ShortLex-minimal elements
is simply

\[ M = \{w \in (\mathbb{N} \cup \{0\})^n - \{(0, \ldots, 0)\} : w - v_i \text{ is not in } (\mathbb{N} \cup \{0\})^n \text{ for any } i\}. \]

Let

\[ K = \{a_1^{\alpha_1} \cdots a_n^{\alpha_n} : (\alpha_1, \ldots, \alpha_n) \in M\}. \]

Since the number of $v_i$ is finite, a finite state automaton can check whether a
word $a_1^{\alpha_1} \cdots a_n^{\alpha_n}$ lies in $K$. Therefore $K$ is regular.

Finally, notice that if a word $a_1^{\alpha_1} \cdots a_n^{\alpha_n}$ lies in $K$, then one obtains its
longest proper prefix by decreasing by 1 the right-most non-zero exponent $\alpha_i$.
(Recall that some of the $\alpha_i$, but not all, can be 0.) Thus if $w$ is the tuple in $M$
corresponding to a word in $K$, then the tuple $w'$ corresponding to its longest
proper prefix is obtained by decreasing the right-most non-zero coordinate by
1. Hence if $w - v_i \notin (\mathbb{N} \cup \{0\})^n$ then $w' - v_i \notin (\mathbb{N} \cup \{0\})^n$. Consequently $K$ is
closed under taking longest proper non-empty prefixes, and so, by iteration,
is +-prefix-closed. By the definition of the ShortLex ordering, the language $K$
consists of minimal-length representatives. So $K$ is a robust Markov language
for $S$. Since the generating set represented by $A$ was arbitrary, $S$ is strongly
Markov. [9.1]

Finitely generated abelian groups are Markov, as a consequence of the
more general result that finitely generated polycyclic groups are Markov [GdlH90a,
Corollaire 11]. However, that finitely generated abelian groups are strongly
Markov (an immediate corollary of Proposition 9.1) does not seem to have
been explicitly noted anywhere, although it is implicit in [ECH+92, Chs 3-4].
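The combinatorial core of the proof of Proposition 9.1 (the ShortLex comparison of exponent tuples, the membership test for $M$, and the passage to the longest proper prefix) can be sketched in Python as follows. The function names are choices made here, and the list of tuples $v_i$ used when trying the code is a hypothetical input, not one derived from an actual presentation.

```python
def shortlex_less(u, v):
    """Strict ShortLex comparison of exponent tuples.

    Tuples are compared by coordinate sum first; ties are broken
    lexicographically, u preceding v when the leftmost non-zero
    coordinate of v - u is positive.
    """
    if sum(u) != sum(v):
        return sum(u) < sum(v)
    for a, b in zip(u, v):
        if a != b:
            return b > a
    return False

def in_M(w, vs):
    """w is non-zero and no v_i can be subtracted from it within (N u {0})^n."""
    return any(w) and all(any(wi < vi for wi, vi in zip(w, v)) for v in vs)

def longest_proper_prefix(w):
    """Tuple of the longest proper prefix: decrease the right-most non-zero coordinate."""
    w = list(w)
    for i in reversed(range(len(w))):
        if w[i] > 0:
            w[i] -= 1
            return tuple(w)
    raise ValueError("the zero tuple has no proper prefix")
```

With the hypothetical list `vs = [(2, 1), (0, 3)]`, the tuple `(1, 2)` lies in $M$ and so does its prefix tuple `(1, 1)`, in line with the prefix-closure argument: lowering a coordinate can never make a subtraction of some $v_i$ possible.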



10 VIRTUALLY ABELIAN, NILPOTENT, AND POLYCYCLIC GROUPS 

It is known that nilpotent groups need not be strongly Markov, 
since they may have irrational (indeed, transcendental) growth functions with 
respect to some generating sets [Sto96, Theorem B]. Furthermore, there ex- 
ist virtually abelian groups that do not admit any regular language of min- 
imal length representatives over some generating set (that is, even without 
requiring uniqueness) [NS97]. Thus virtually abelian groups are not in gen- 
eral strongly Markov. 

Question 10.1. Are finitely generated semigroups that are nilpotent (in the
sense of Malcev [Mal53]) Markov? In particular, are all finitely generated
subsemigroups of nilpotent groups Markov?

This section exhibits two examples to show that finitely generated sub- 
semigroups of virtually abelian groups and of polycyclic groups need not be 
Markov. All finitely generated subgroups of such groups are Markov, since 
these classes of groups are closed under taking subgroups. 

The example of a non-Markov subsemigroup of a virtually abelian group 
(Example 10.4) is particularly important: First, it shows that the class of groups 
all of whose finitely generated subsemigroups are Markov is not closed under 
forming finite extensions. Second, virtually abelian groups satisfy a non-trivial 
semigroup identity and thus have the following property: if S is a subsemi- 
group and H the subgroup it generates, then H is [isomorphic to] the universal 
group of S. In general groups, this is not true: H is in general a homomorphic 
image of the universal group of S. (The universal group of S is the group 
obtained by taking a presentation for S and considering it as a group presentation;
see [CP67, Ch. 12] or the discussion in [Cai05, § 5.2.1] for background
information.) Thus the example is a non-Markov semigroup with a Markov 
universal group. 

The following technical result will be used in proving both examples non- 
Markov: 

Lemma 10.2. Let $S$ be a semigroup and $A = \{a, b, c, d, e, f, g, h, i, j\}$ an alphabet
representing a finite generating set for $S$. Suppose that for $\alpha, \beta \in \mathbb{N} \cup \{0\}$ with $\alpha \neq \beta$
the following conditions hold:

1. The element represented by $ab^\alpha cd^\beta e$ is represented by no other word over $A$.

2. The element represented by $fg^\alpha hi^\beta j$ is represented by no other word over $A$.

3. The equality $\overline{ab^\alpha cd^\alpha e} = \overline{fg^\alpha hi^\alpha j}$ holds, and the only words representing this
element are $ab^\alpha cd^\alpha e$ and $fg^\alpha hi^\alpha j$.

Then $S$ is not Markov.

Proof of 10.2. Suppose for reductio ad absurdum that $S$ is Markov. Then it admits
a regular language of unique representatives $L$ over $A$ by Proposition 5.3. So
$K = L \cap (ab^*cd^*e \cup fg^*hi^*j)$ is regular. By assumption, when $\alpha \neq \beta$, the
element represented by $ab^\alpha cd^\beta e$ is represented by no other word over $A$.
Thus $K \supseteq \{ab^\alpha cd^\beta e : \alpha \neq \beta\}$ and similarly $K \supseteq \{fg^\alpha hi^\beta j : \alpha \neq \beta\}$. Thus
$ab^*cd^*e - K \subseteq \{ab^\alpha cd^\alpha e : \alpha \in \mathbb{N} \cup \{0\}\}$ and $fg^*hi^*j - K \subseteq \{fg^\alpha hi^\alpha j : \alpha \in \mathbb{N} \cup \{0\}\}$.

Furthermore, the only representatives over $A$ of the element $\overline{ab^\alpha cd^\alpha e}$ are
$ab^\alpha cd^\alpha e$ and $fg^\alpha hi^\alpha j$. So at least one of the regular languages $ab^*cd^*e - K$
and $fg^*hi^*j - K$ is infinite. Assume the former; the latter case is similar. Since
$ab^*cd^*e - K \subseteq \{ab^\alpha cd^\alpha e : \alpha \in \mathbb{N} \cup \{0\}\}$ is infinite, it contains arbitrarily long
words $ab^\alpha cd^\alpha e$. So a string of symbols $b$ can be pumped, which contradicts
the fact that every word in this language is of the form $ab^\alpha cd^\alpha e$. Thus $S$ is
not Markov. [10.2]

Example 10.3. The semigroup presented by

\[ \langle a, b, c, d, e, f, g, h, i, j \mid ab^\alpha cd^\alpha e = fg^\alpha hi^\alpha j,\ \alpha \in \mathbb{N} \cup \{0\} \rangle, \]

which is isomorphic to a subsemigroup of a polycyclic group [Cai09, § 3], is
not Markov by Lemma 10.2 above.

Example 10.4. Let $\mathcal{S}_{11}$ be the symmetric group on eleven elements. Let $\mathbb{Z}^{11}$ be
the direct product of eleven copies of the integers under addition. View elements
of $\mathbb{Z}^{11}$ as 11-tuples of integers. Let $G = \mathcal{S}_{11} \ltimes \mathbb{Z}^{11}$, where $\mathcal{S}_{11}$ acts (on the
right) by permuting the components of elements of $\mathbb{Z}^{11}$. (The $\mathbb{Z}$-components
are indexed from 1 at the left to 11 at the right.) The abelian subgroup $\mathbb{Z}^{11}$ of
$G$ has index $11!$, so $G$ is a virtually abelian group.

Let $A = \{a, b, c, d, e, f, g, h, i, j\}$ be an alphabet representing elements of $G$
in the following way:

\[ \overline{a} = [(1\ 3), (0,1,1,0,0,0,1,0,0,0,0)], \qquad \overline{f} = [(1\ 5), (0,1,0,0,1,0,1,0,0,0,0)], \]
\[ \overline{b} = [\mathrm{id}, (0,0,1,0,0,0,0,0,0,1,0)], \qquad \overline{g} = [\mathrm{id}, (0,0,0,0,1,0,0,0,0,0,1)], \]
\[ \overline{c} = [(1\ 3)(2\ 4), (1,0,0,0,0,0,0,1,0,0,0)], \qquad \overline{h} = [(1\ 5)(2\ 6), (1,0,0,0,0,0,0,1,0,0,0)], \]
\[ \overline{d} = [\mathrm{id}, (0,0,0,1,0,0,0,0,0,-1,0)], \qquad \overline{i} = [\mathrm{id}, (0,0,0,0,0,1,0,0,0,0,-1)], \]
\[ \overline{e} = [(2\ 4), (0,1,0,0,0,0,0,0,1,0,0)], \qquad \overline{j} = [(2\ 6), (0,1,0,0,0,0,0,0,1,0,0)]. \]

Let $S$ be the subsemigroup of $G$ generated by $A$. The aim is to show that $S$ is not
Markov. [We admit that the generators in $A$ may look intimidating. However,
they interact in a fairly nice way, and the method in their madness will become
apparent.]

First of all, some preliminaries are necessary. For any $\alpha, \beta \in \mathbb{N} \cup \{0\}$,

\begin{align*}
\overline{ab^\alpha cd^\beta e}
&= [(1\ 3), (0,1,1,0,0,0,1,0,0,0,0)]\,[\mathrm{id}, (0,0,\alpha,0,0,0,0,0,0,\alpha,0)]\,\overline{cd^\beta e} \\
&= [(1\ 3), (0,1,\alpha+1,0,0,0,1,0,0,\alpha,0)]\,[(1\ 3)(2\ 4), (1,0,0,0,0,0,0,1,0,0,0)]\,\overline{d^\beta e} \\
&= [(2\ 4), (\alpha+2,0,0,1,0,0,1,1,0,\alpha,0)]\,[\mathrm{id}, (0,0,0,\beta,0,0,0,0,0,-\beta,0)]\,\overline{e} \\
&= [(2\ 4), (\alpha+2,0,0,\beta+1,0,0,1,1,0,\alpha-\beta,0)]\,[(2\ 4), (0,1,0,0,0,0,0,0,1,0,0)] \\
&= [\mathrm{id}, (\alpha+2,\beta+2,0,0,0,0,1,1,1,\alpha-\beta,0)],
\end{align*}

and

\begin{align*}
\overline{fg^\alpha hi^\beta j}
&= [(1\ 5), (0,1,0,0,1,0,1,0,0,0,0)]\,[\mathrm{id}, (0,0,0,0,\alpha,0,0,0,0,0,\alpha)]\,\overline{hi^\beta j} \\
&= [(1\ 5), (0,1,0,0,\alpha+1,0,1,0,0,0,\alpha)]\,[(1\ 5)(2\ 6), (1,0,0,0,0,0,0,1,0,0,0)]\,\overline{i^\beta j} \\
&= [(2\ 6), (\alpha+2,0,0,0,0,1,1,1,0,0,\alpha)]\,[\mathrm{id}, (0,0,0,0,0,\beta,0,0,0,0,-\beta)]\,\overline{j} \\
&= [(2\ 6), (\alpha+2,0,0,0,0,\beta+1,1,1,0,0,\alpha-\beta)]\,[(2\ 6), (0,1,0,0,0,0,0,0,1,0,0)] \\
&= [\mathrm{id}, (\alpha+2,\beta+2,0,0,0,0,1,1,1,0,\alpha-\beta)].
\end{align*}

In particular, $\overline{ab^\alpha cd^\alpha e} = \overline{fg^\alpha hi^\alpha j}$.
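The two computations above can be replayed numerically. The following Python sketch encodes an element $[\sigma, u]$ of $G$ as a pair of tuples and multiplies by the rule $[\sigma, u][\tau, v] = [\sigma\tau, (u)\tau + v]$; this multiplication convention, the 0-indexed encoding of permutations, and all function names are assumptions made here, chosen to agree with the displayed products.

```python
from functools import reduce

N = 11
IDENTITY = tuple(range(N))

def perm(*transpositions):
    """Product of disjoint transpositions (1-indexed), as a tuple of 0-indexed images."""
    p = list(range(N))
    for i, j in transpositions:
        p[i - 1], p[j - 1] = j - 1, i - 1
    return tuple(p)

def vec(entries):
    """Element of Z^11 with the given 1-indexed non-zero components."""
    v = [0] * N
    for pos, val in entries.items():
        v[pos - 1] = val
    return tuple(v)

def mul(g, h):
    """[s, u][t, v] = [st, (u)t + v], where t permutes the components of u."""
    (s, u), (t, v) = g, h
    moved = [0] * N
    for k in range(N):
        moved[t[k]] = u[k]                      # component k of u moves to position t[k]
    st = tuple(t[s[k]] for k in range(N))       # apply s first, then t
    return st, tuple(m + x for m, x in zip(moved, v))

GEN = {
    "a": (perm((1, 3)),         vec({2: 1, 3: 1, 7: 1})),
    "b": (IDENTITY,             vec({3: 1, 10: 1})),
    "c": (perm((1, 3), (2, 4)), vec({1: 1, 8: 1})),
    "d": (IDENTITY,             vec({4: 1, 10: -1})),
    "e": (perm((2, 4)),         vec({2: 1, 9: 1})),
    "f": (perm((1, 5)),         vec({2: 1, 5: 1, 7: 1})),
    "g": (IDENTITY,             vec({5: 1, 11: 1})),
    "h": (perm((1, 5), (2, 6)), vec({1: 1, 8: 1})),
    "i": (IDENTITY,             vec({6: 1, 11: -1})),
    "j": (perm((2, 6)),         vec({2: 1, 9: 1})),
}

def evaluate(word):
    """Element of G represented by a non-empty word over A."""
    return reduce(mul, (GEN[ch] for ch in word))
```

For instance, `evaluate("abbcde")` corresponds to $\alpha = 2$, $\beta = 1$ and has $\mathbb{Z}^{11}$-component $(4, 3, 0, 0, 0, 0, 1, 1, 1, 1, 0)$, while `evaluate("abcde") == evaluate("fghij")` illustrates the equality for $\alpha = \beta = 1$.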
Lemma 10.5. Let $\alpha, \beta \in \mathbb{N} \cup \{0\}$ with $\alpha \neq \beta$.

1. The only word over $A$ representing $\overline{ab^\alpha cd^\beta e}$ is $ab^\alpha cd^\beta e$, and the only word
over $A$ representing $\overline{fg^\alpha hi^\beta j}$ is $fg^\alpha hi^\beta j$.

2. The only words over $A$ representing $\overline{ab^\alpha cd^\alpha e} = \overline{fg^\alpha hi^\alpha j}$ are $ab^\alpha cd^\alpha e$ and
$fg^\alpha hi^\alpha j$.

Proof of 10.5. Let $\alpha, \beta \in \mathbb{N} \cup \{0\}$. For the present, allow the possibility that $\alpha$
and $\beta$ are equal.

Let $w \in A^+$ be some word representing

\[ s = \overline{ab^\alpha cd^\beta e} = [\mathrm{id}, (\alpha+2, \beta+2, 0, 0, 0, 0, 1, 1, 1, \alpha-\beta, 0)]. \tag{10.1} \]

Let $A' = \{a, c, e, f, h, j\}$; observe that $A'$ consists of exactly those elements of $A$
representing elements with a non-zero seventh, eighth, or ninth $\mathbb{Z}$-component,
which are also exactly those that have non-identity $\mathcal{S}_{11}$-components. Let
$A'' = A - A' = \{b, d, g, i\}$; observe that $A''$ consists of exactly those elements
of $A$ representing elements with a non-zero tenth or eleventh $\mathbb{Z}$-component,
which are also exactly those that have identity $\mathcal{S}_{11}$-components.

First, consider which letters from A' can appear in w. Examining the 
seventh, eighth, and ninth Z-components (which are unaffected by the actions 
of any of the §n -components), shows that w contains one letter a or letter e, 
one letter c or letter h, and one letter e or letter j, and no other letter from A'. 
For the product of the §n -components to be id, the letters from A' inw must 
then be a, c, e or f , h, j (in some order). 

Consider these two cases separately: 

1. Suppose first that the letters from A' in w are a, c, e. Since the §n- 
components of a, c, e do not affect the fifth and sixth Z-components, and 
since these are both in s, w cannot contain letters g or h. So w is a 
rearrangement of aceb y d 5 for some y, 5 G N U {0}. Now, in w the let- 
ter a must precede the letter c, for otherwise the third Z-component of w 
would be non-zero. Similarly, c must precede e, for otherwise the fourth 
Z-component of w would be non-zero. The letters b must all lie between a 
and c, for otherwise the third Z-component of W would be non-zero, and 
similarly the letters d must all lie between c and e, for otherwise the fourth 
Z-component of w would be non-zero. So w = ab Y cd 6 e. Examining the 
first and second Z-components forces y = a and 5 = (3. So if the letters 
from A' in w are a, c, e, then w = ab a cd |3 e. 
2. Suppose now that the letters from A' in w are f, h, j. Since the
§n-components of f, h, j do not affect the third and fourth Z-components,
and since these are both 0 in s, w cannot contain letters b or d. So w is
a rearrangement of $fhjg^{\gamma}i^{\delta}$ for some $\gamma, \delta \in \mathbb{N} \cup \{0\}$.
Now, in w, the letter f must precede the letter h, for otherwise the fifth
Z-component of w would be non-zero. Similarly, h must precede j, for otherwise
the sixth Z-component of w would be non-zero. The letters g must all lie
between f and h, for otherwise the fifth Z-component of w would be non-zero,
and similarly the letters i must all lie between h and j, for otherwise the
sixth Z-component of w would be non-zero. So $w = fg^{\gamma}hi^{\delta}j$.
Examining the first and second Z-components forces $\gamma = \alpha$ and
$\delta = \beta$. So if the letters from A' in w are f, h, j, then
$w = fg^{\alpha}hi^{\beta}j$. In this case,

$\overline{w} = [\mathrm{id}, (\alpha + 2, \beta + 2, 0, 0, 0, 0, 1, 1, 1, 0, \alpha - \beta)]$.

By (10.1), this forces $\alpha = \beta$, and $w = fg^{\alpha}hi^{\alpha}j$.

So if $\alpha \neq \beta$, only the first case holds and $w = ab^{\alpha}cd^{\beta}e$. If, on the other hand,
$\alpha = \beta$, then both cases can hold and w is either $ab^{\alpha}cd^{\alpha}e$ or $fg^{\alpha}hi^{\alpha}j$.

Parallel reasoning shows that if $\alpha \neq \beta$, the element represented by $fg^{\alpha}hi^{\beta}j$
is represented by no other word over A. [10.5]

By Lemma 10.5, the semigroup S satisfies the hypotheses of Lemma 10.2 
and so is not Markov. 



11 MISCELLANEOUS EXAMPLES OF MARKOV AND NON-MARKOV 
SEMIGROUPS 

This section gathers miscellaneous examples to illustrate partic- 
ular aspects of the class of Markov semigroups. 

First, here is an example of a non-Markov semigroup: 

Example 11.1. Let A = {a, b, c, d} and let 

$S = \{(ba, ab), (bc, aca), (acc, d)\} \cup \{(dx, d), (xd, d) : x \in A\}$.

The monoid presented by $\langle A \mid S \rangle$ does not admit a regular language of unique
representatives by [OKK98, Example 4.6], and thus is not Markov.

Since free groups of finite rank are Markov (either by Proposition 7.1 or as
a corollary of [GdlH90a, Proposition 9]) and indeed strongly Markov (since
they are hyperbolic; see [GdlH90a, Théorème 13]), the following example is
worth noting:

Example 11.2. The free inverse monoid of rank 1 is not Markov, because it 
admits no regular language of unique normal forms over the generating set 
[CS01, Proof of Theorem 2.7]. 

The Baumslag-Solitar groups play their customary role of being pleasant 
and easy to understand but slightly eccentric. This is a consequence of the 
following theorem of Groves: 

Theorem 11.3 ([Gro96, Corollary in § 1]). There is no regular language of
minimal-length representatives for the Baumslag-Solitar groups

$\langle a, t \mid (t^{-1}at, a^p) \rangle$,

where p > 1, with respect to the alphabet $\{a, a^{-1}, t, t^{-1}\}$.
[The original statement of this result by Groves is phrased in terms of
minimal-length (unique) normal forms. However, the property of uniqueness
is not used anywhere in the proof. Groves states the result in these terms
because he places the result in the context of calculating growth series.]

Example 11.4. The Baumslag-Solitar group $\langle a, t \mid (t^{-1}at, a^2) \rangle$ is presented by
a confluent noetherian rewriting system [ECH+92, p. 156], and is therefore
Markov by Proposition 7.1. However, since it admits no regular language of
minimal-length representatives by Theorem 11.3, it is not strongly Markov.
(However, it does admit a one-counter language of minimal-length normal
forms [Eld05, §§ 4-5].)

This example leads on to the following question: 

Question 11.5. Is every one-relation semigroup Markov? 

If every one-relation semigroup can be presented by a confluent noetherian 
rewriting system (an open question, since it would imply a solution to the 
word problem), this question would have a positive answer by Proposition
7.1. 

A robustly Markov monoid may not be residually finite: 

Example 11.6. Let A = {a, b} and let $\mathcal{R} = \{(ab^2, b)\}$. Then $(A, \mathcal{R})$ is a confluent
noetherian rewriting system and so the monoid M presented by $\langle A \mid \mathcal{R} \rangle$ is
Markov by Proposition 7.1. This monoid M is known to be non-residually
finite [Lal74].
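
The rewriting system of Example 11.6 is small enough to experiment with. The following is only an illustrative sketch: it assumes (as the appeal to Proposition 7.1 suggests) that the relevant language of representatives is the set of irreducible words, that is, words over {a, b} with no factor abb. The function names `normal_form` and `is_irreducible` are hypothetical and not taken from the paper.

```python
def normal_form(word: str) -> str:
    """Apply the single rule abb -> b until no left-hand side remains.

    The rule is length-reducing, so the loop terminates (the system is
    noetherian); confluence means the result does not depend on which
    occurrence of "abb" is rewritten first.
    """
    while "abb" in word:
        word = word.replace("abb", "b", 1)  # rewrite the leftmost occurrence
    return word


def is_irreducible(word: str) -> bool:
    """A word is irreducible when it contains no left-hand side of a rule."""
    return "abb" not in word
```

Since a prefix of a word avoiding the factor abb also avoids it, the set of irreducible words is visibly prefix-closed, and it is regular because it is the complement of the regular language of words containing abb.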

A strongly Markov monoid may not be finitely presented: 

Example 11.7. Let A = {a,b,c,d,e,f} and $\mathcal{R} = \{(ab^n c, de^n f) : n \in \mathbb{N}\}$. Let
M be the monoid presented by $\langle A \mid \mathcal{R} \rangle$. Then M is not finitely presented since
no relation in $\mathcal{R}$ can be deduced from the others. But M is strongly Markov:
since every generator in A is indecomposable, any alphabet representing a
generating set for M must contain a subalphabet representing A; thus
$A^* - A^*ab^*cA^*$ is a robust Markov language for M over any alphabet representing
a generating set.

This example suggests the following question: 

Question 11.8. Does there exist a strongly Markov group that is not finitely
presented? If not, does there exist a non-finitely presented Markov or
robustly Markov group? [The authors conjecture that the answers to these
questions are both yes, for intuition suggests that a Markov or robust Markov
language does not impose enough structure on a group to guarantee finite
presentability.]

The following easy example shows that it is possible for a robustly Markov 
monoid to have unsolvable word problem: 

Example 11.9. Let I be a non-recursive subset of $\mathbb{N}$. Let A = {a, b, c, x, y} and

$\mathcal{R} = \{(ab^{\alpha}c, x) : \alpha \in I\} \cup \{(ab^{\alpha}c, y) : \alpha \notin I\}$.

The rewriting system $(A, \mathcal{R})$ is confluent because left-hand sides of rules in
$\mathcal{R}$ overlap only when they are identical. It is noetherian because it is
length-reducing. The language of left-hand sides of rules in $\mathcal{R}$ is

$\{ab^{\alpha}c : \alpha \in I\} \cup \{ab^{\alpha}c : \alpha \notin I\} = ab^*c$

and so is regular. By Proposition 7.1, L is a robust Markov language for the
monoid presented by $\langle A \mid \mathcal{R} \rangle$.

However, this monoid does not have solvable word problem, since $ab^{\alpha}c$
and x represent the same element of the monoid if and only if $\alpha \in I$. But
membership of I is undecidable since I is non-recursive.

However, a finitely presented Markov semigroup will have soluble word 
problem, as does any finitely presented semigroup that admits a recursively 
enumerable language of unique representatives [CS01, Theorem 1.5]. 



12 HYPERBOLICITY & AUTOMATICITY 

Ghys et al. proved that hyperbolic groups are Markov using a direct
approach [GdlH90a, §3]. It also follows using the machinery of automatic
groups: over any generating set, the language of geodesics is regular and
forms part of a prefix-closed automatic structure [ECH+92, Theorem 3.4.5],
and the construction of an automatic structure with uniqueness [ECH+92,
Theorem 2.5.1] preserves prefix-closure when applied in this particular case
(although not in the general case).

Hyperbolicity can be generalized from groups to semigroups in either a 
geometric or linguistic sense. The latter generalization, which is termed word- 
hyperbolicity, is due to Duncan & Gilman [DG04]. It informally says that a 
semigroup is word-hyperbolic if it admits a regular language of representa- 
tives such that the multiplication table in terms of these representatives is a 
context-free language. 

Definition 12.1. A word-hyperbolic structure for a semigroup S is a pair (A, L),
where A is a finite alphabet representing a generating set for S and L is a
regular language over A such that $\overline{L} = S$ and the language

$M(L) = \{u\#_1v\#_2w^{\mathrm{rev}} : u, v, w \in L \wedge \overline{uv} = \overline{w}\}$

(where $\#_1$ and $\#_2$ are new symbols not in A) is context-free.

A semigroup is word-hyperbolic if it admits a word-hyperbolic structure. 

A group is word-hyperbolic in the sense of Definition 12.1 if and only if it 
is hyperbolic in the sense of Gromov [DG04, Corollary 4.3]. For further back- 
ground information on word-hyperbolic semigroups, see [DG04, HKOT02]. 

The following example is taken from [CM, Example 4.2]: 

Example 12.2. Let A = {a,b,c,d} and let $\mathcal{R} = \{(ab^{\alpha}c^{\alpha}d, \varepsilon) : \alpha \in \mathbb{N}\}$. Let
M be the monoid presented by $\langle A \mid \mathcal{R} \rangle$. Since the rewriting system $(A, \mathcal{R})$ is
context-free, M is word-hyperbolic by [CM, Theorem 3.1]. The reasoning in
[CM, Example 4.2] shows that it does not admit a regular language of unique
normal forms over any generating set, and so in particular cannot be Markov
by Proposition 5.3.

Thus word-hyperbolic monoids are not in general Markov. Moreover if the 
regularity condition on the left-hand sides of rewriting rules in Proposition 
7.1 is weakened to being context-free (or even just to being one-counter), then 
the semigroups or monoids thus presented are not Markov in general. 

Example 12.2 is not finitely presented, and it does not admit a word- 
hyperbolic structure with uniqueness [CM, Example 4.2]. This provokes the 
following questions: 






Question 12.3. Does there exist a non-Markov finitely presented word-hyperbolic 
monoid? 

Question 12.4. Does there exist a non-Markov monoid that admits a word- 
hyperbolic structure with uniqueness? 

Since satisfying a linear isoperimetric inequality is one of several equivalent
characterizations of hyperbolic groups (see, for example, [ABC+91, Ch. 1]),
the following question is of interest:

Question 12.5. Does there exist a non-Markov semigroup with linear isoperi- 
metric inequality? 

Markov groups are not in general automatic, since all polycyclic groups are
Markov [GdlH90a, Corollaire 11], but a nilpotent group that is not virtually
abelian cannot be automatic [ECH+92, Theorem 8.2.8].

Question 12.6. Are automatic semigroups Markov? (Note that, unlike the 
situation for groups, an automatic semigroup need not be word-hyperbolic.) 
This question relates to the long-standing open question of whether an au- 
tomatic semigroup or group admits a prefix-closed automatic structure with 
uniqueness [ECH+92, Open Question 2.5.10]. Admitting such an automatic
structure entails being Markov. 



13 ADJOINING AN IDENTITY OR ZERO 

This section and those that follow examine the interaction of
the classes of Markov, robustly Markov, and strongly Markov semigroups 
with various semigroup constructions. The main questions are whether these 
classes of semigroups are closed under a particular construction, and whether 
the semigroup resulting from such a construction being Markov, robustly 
Markov, or strongly Markov implies that the original semigroup is (or the 
original semigroups are) Markov, robustly Markov, or strongly Markov. 

Arguably the simplest semigroup constructions are the adjoining of an identity
or zero, and it is reassuring that both questions have positive answers for
these constructions:

Proposition 13.1. Let S be a semigroup. Then:

1. S is Markov if and only if $S^1$ is Markov.

2. S is robustly Markov if and only if $S^1$ is robustly Markov.

3. S is strongly Markov if and only if $S^1$ is strongly Markov.

Proof of 13.1. Let A be a finite alphabet representing a semigroup generating
set for S. Let 1 be a new symbol not in A representing the adjoined identity
of $S^1$.

Let L be a semigroup Markov language for S with respect to A. Then L is
regular, +-prefix-closed, and maps bijectively onto S. Let $K = L \cup \{1\}$. Then K is
regular, +-prefix-closed, and maps bijectively onto $S^1$. Thus K is a semigroup
Markov language for $S^1$.

Furthermore, if L is a robust semigroup Markov language, then so is K, 
since 1 is the unique shortest word representing the adjoined identity, and the 
natural lengths of elements in S over A and over A U {1} are equal. 
Now let L be a semigroup Markov language for $S^1$ over an alphabet B
representing some generating set for $S^1$. Now, B must be of the form $A \cup \{1\}$,
where 1 represents the adjoined identity and A represents a generating set for
S, since no product of elements of S equals the adjoined identity.

Suppose some $w \in L$ contains the symbol 1. Then $w = w'1w''$ and so $w'$
and $w'1$ represent the same element of $S^1$, unless $w'$ is the empty word, which
is not a member of the semigroup Markov language L. Since L is +-prefix-closed,
both of these prefixes would lie in L, contradicting uniqueness. So such a word
w can only contain a single instance of the symbol 1, and it must be the first
symbol of w. (If L is a robust semigroup Markov language, the only such word is
w = 1, since otherwise $w'w''$ would be a shorter word representing $\overline{w}$, as in
the proof of Proposition 4.3.)

Let 

$K = ((L - \{1\}) - 1A^*) \cup \{u \in A^+ : 1u \in L\}$.

Arguing as in the proof of Proposition 4.1, it follows that K is +-prefix-closed,
is regular, and contains a unique representative for each element of S. Thus K
is a semigroup Markov language for S over the alphabet A.

Furthermore, if L is a robust semigroup Markov language, the only word 
in L containing the symbol 1 is the word 1 itself, so in this case 

K = L-{1}. 

From these arguments, it follows that S is Markov if and only if $S^1$ is
Markov and that S is robustly Markov if and only if $S^1$ is robustly Markov.
From the arbitrary choice of generating sets, and the fact that any alphabet
representing a generating set for $S^1$ must be of the form $A \cup \{1\}$, where 1
represents the adjoined identity and A represents a generating set for S, it
follows that S is strongly Markov if and only if $S^1$ is strongly Markov. [13.1]

Proposition 13.2. Let S be a semigroup. Then:

1. S is Markov if and only if $S^0$ is Markov.

2. S is robustly Markov if and only if $S^0$ is robustly Markov.

3. S is strongly Markov if and only if $S^0$ is strongly Markov.

Proof of 13.2. By reasoning parallel to the proof of Proposition 13.1, substituting
0 for 1 and $S^0$ for $S^1$ as appropriate, it follows that if L is a [robust] Markov
language for S, then $L \cup \{0\}$ is a [robust] Markov language for $S^0$.

Now let L be a Markov language for $S^0$ over an alphabet B representing
some generating set for $S^0$. Now, B must be of the form $A \cup \{0\}$, where 0
represents the adjoined zero and A represents a generating set for S, since no
product of elements of S equals the adjoined zero.

Suppose some $w \in L$ contains the symbol 0, with $w = w'0w''$. Then
$w'0$ and w both represent the zero of the semigroup, which contradicts the
uniqueness of representatives in L unless $w''$ is the empty word. So such a
word w can contain only a single symbol 0, and this must be the last letter of
the word. (If L is a robust Markov language, the only such word is w = 0, since
this is the unique shortest word over $A \cup \{0\}$ representing the adjoined zero.)
Notice that there can only be one such word, since any other word containing
the symbol 0 would also represent the adjoined zero. So L contains a unique
word $w = w'0$ containing the symbol 0, and this word is not the prefix of any
other word in L.
Let $K = L - \{w'0\}$. Then K is +-prefix-closed (since $w'0$ is not a prefix
of any other word in L), is regular, and contains a unique representative for
each element of S. Finally, $K \subseteq A^+$ by the observation at the end of the last
paragraph. Thus K is a Markov language for S over the alphabet A.

From these arguments, it follows that S is Markov if and only if $S^0$ is
Markov and that S is robustly Markov if and only if $S^0$ is robustly Markov.
From the arbitrary choice of generating sets, and the fact that any alphabet
representing a generating set for $S^0$ must be of the form $A \cup \{0\}$, it follows that
S is strongly Markov if and only if $S^0$ is strongly Markov. [13.2]



14 DIRECT PRODUCTS 

The class of Markov groups is closed under direct products, as 
a special case of the fact that an extension of one Markov group by another 
is also Markov [GdlH90a, Proposition 10]. For monoids, the result is also
positive: 

Theorem 14.1. 1. If M and N are Markov monoids, then M × N is a Markov
monoid.

2. If M and N are robustly Markov monoids, then M × N is a robustly Markov monoid.

Proof of 14.1. 1. Let A and B be finite alphabets representing monoid
generating sets for M and N with representation maps $\phi_A : A \to M$ and
$\phi_B : B \to N$, respectively, and let K and L be monoid Markov languages
over A and B for M and N, respectively. Then H = KL is prefix-closed,
regular, and maps bijectively onto M × N under the representation map
$\phi : A \cup B \to M \times N$ defined by $a \mapsto (a\phi_A, 1_N)$ and $b \mapsto (1_M, b\phi_B)$.

2. Proceed as in the previous part, but with K and L being robust Markov
languages. Then KL is a robust Markov language for M × N since (with
respect to the representation map $\phi$) $\lambda_{A \cup B}(\overline{uv}) = \lambda_A(\overline{u}) + \lambda_B(\overline{v})$ for all
$u \in K$ and $v \in L$. [14.1]
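
The construction H = KL can be checked by brute force in a toy case. The sketch below is only an illustration under assumptions not made in the paper: M and N are both taken to be the free monogenic monoid (the non-negative integers under addition), with Markov languages K = a* and L = b*, so that KL = a*b* should map bijectively onto pairs of non-negative integers. All function names are hypothetical.

```python
from itertools import product


def words(alphabet, max_len):
    """Yield every word over `alphabet` of length at most `max_len`."""
    for n in range(max_len + 1):
        for letters in product(alphabet, repeat=n):
            yield "".join(letters)


def in_K(w):
    return set(w) <= {"a"}  # K = a*


def in_L(w):
    return set(w) <= {"b"}  # L = b*


def in_KL(w):
    """Membership in the concatenation KL, by trying every split point."""
    return any(in_K(w[:i]) and in_L(w[i:]) for i in range(len(w) + 1))


def represent(w):
    """The element of M x N represented by w: a -> (1, 0) and b -> (0, 1)."""
    return (w.count("a"), w.count("b"))


MAX_LEN = 5
H = [w for w in words("ab", MAX_LEN) if in_KL(w)]
images = [represent(w) for w in H]
expected = {(m, n) for m in range(MAX_LEN + 1)
            for n in range(MAX_LEN + 1) if m + n <= MAX_LEN}

bijective = len(images) == len(set(images)) and set(images) == expected
prefix_closed = all(w[:i] in H for w in H for i in range(len(w) + 1))
```

Both `bijective` and `prefix_closed` come out true on this bounded fragment, matching the claim of part 1 for this particular pair of monoids.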

However, for semigroups the situation is obscure. First of all, a direct prod- 
uct of finitely generated semigroups is not necessarily finitely generated. For 
example, the direct product of two copies of the natural numbers N (excluding 
0) is not finitely generated. (Notice that N is strongly Markov.) Even when the 
direct product is finitely generated, the relationship of a finite generating set 
to the finite generating sets of the direct factors is complex; see the discussion 
in [RRW98, § 2]. It is possible to prove that a direct product of a Markov semi- 
group and a finite semigroup is Markov if it is finitely generated (Theorem 
14.2 below). The general idea of the proof is similar to that used by Campbell 
et al. to prove the analogous result for automatic semigroups [CRRT00,
Theorem 1.1(ii)], but more sophisticated reasoning is required here to ensure that
prefix-closure and uniqueness are preserved. However, the issue of prefix- 
closure seems to make it impossible to adapt and strengthen the idea used by 
Campbell et al. for direct products of infinite semigroups. An entirely new 
approach may be required in this case. 

Theorem 14.2. Let S be a Markov semigroup and let T be finite. Then S x T is a 
Markov semigroup if and only if it is finitely generated. 
Proof of 14.2. One direction of the result is trivial: if S × T is a Markov
semigroup, then by definition it is finitely generated.

Suppose that S × T is finitely generated. Then by [RRW98, Lemma 2.3], the
finite semigroup T is such that $T^2 = T$.

Since S is a Markov semigroup, it admits a Markov language L over some
finite alphabet A representing a generating set for S.

Let B be a finite alphabet in bijection with T. Since $T^2 = T$, it follows that
$T^n = T$ for all $n \in \mathbb{N}$ and so for any $t \in T$ and $n \in \mathbb{N}$, there is a word of length
n over B representing t. Let

$R = \{(u,v) : u,v \in B^+, |u| = |v|, \overline{u} = \overline{v}\}$;

notice that R is a synchronous rational relation. Let $\sqsubseteq_{\mathrm{Lex}}$ be the lexicographic
ordering on $B^+$ based on some total ordering of B. Let

$R' = \{u \in B^+ : (\forall v \in B^+)((u,v) \in R \implies u \sqsubseteq_{\mathrm{Lex}} v)\}$.

The language R' contains exactly one (lexicographically minimal) representative
of each length for each element of T. Furthermore, the language R' is
+-prefix-closed, for if u is not $\sqsubseteq_{\mathrm{Lex}}$-minimal amongst words of length |u|
representing $\overline{u}$, then for any $a \in B$, the word ua is not $\sqsubseteq_{\mathrm{Lex}}$-minimal amongst
words of length |ua| representing $\overline{ua}$.
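
The two properties just claimed for R' (one representative of each length per element, and closure under non-empty prefixes) can be verified exhaustively for a small T. The following sketch assumes, purely for illustration, that T is the cyclic group of order 3 written additively (so certainly $T^2 = T$), with the letter i representing the element i; this choice of T and all names in the code are hypothetical.

```python
from itertools import product

ORDER = 3          # T = Z/3 under addition (an assumed toy example)
ALPHABET = "012"   # alphabet B in bijection with T; letter i represents i


def value(word):
    """The element of T represented by a word over B."""
    return sum(int(letter) for letter in word) % ORDER


def lex_min_reps(length):
    """Map each element of T to its lexicographically least word of this length."""
    best = {}
    # product() over a sorted alphabet yields words in lexicographic order,
    # so the first word seen for each element is the lexicographic minimum.
    for letters in product(ALPHABET, repeat=length):
        word = "".join(letters)
        best.setdefault(value(word), word)
    return best


MAX_LEN = 4
R_prime = set()
for n in range(1, MAX_LEN + 1):
    R_prime.update(lex_min_reps(n).values())

one_per_length = all(len(lex_min_reps(n)) == ORDER for n in range(1, MAX_LEN + 1))
plus_prefix_closed = all(w[:-1] in R_prime for w in R_prime if len(w) > 1)
```

In this example the minimal representative of t of length n is a block of n - 1 zeros followed by t, which makes the prefix-closure visible directly.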
Define

$\delta : \bigcup_{n=0}^{\infty} (A^n \times B^n) \to (A \times B)^*$

(so that $(u,v)\delta$ is defined when $u \in A^*$ and $v \in B^*$ have equal length) by

$(a_1a_2\cdots a_n, b_1b_2\cdots b_n) \mapsto (a_1,b_1)(a_2,b_2)\cdots(a_n,b_n)$,

where $a_i \in A$, $b_i \in B$.

Let $K = \{(w,u) : w \in L, u \in R', |w| = |u|\}$. Then $K\delta$ is a regular language
over $A \times B$. Since both L and R' are +-prefix-closed, so is $K\delta$.

Now let $(s,t) \in S \times T$. Then since L maps onto S, there is a word $w \in L$
with $\overline{w} = s$. There is a word u' of length |w| over B such that $\overline{u'} = t$. Let
u be the $\sqsubseteq_{\mathrm{Lex}}$-minimal such word. Then |u| = |w| and so $(w,u) \in K$ and so
$(w,u)\delta \in K\delta$ represents (s,t). So $K\delta$ maps onto S × T.

Now suppose $(w,u)\delta, (w',u')\delta \in K\delta$ represent the same element of S × T.
Then $\overline{w} = \overline{w'}$ and $\overline{u} = \overline{u'}$. Since L is a Markov language for S, it maps
bijectively onto S and so w = w'. In particular, |w| = |w'|, and so |u| = |u'|
by the definition of K. Since $\overline{u} = \overline{u'}$ and |u| = |u'|, and R' contains exactly
one representative of $\overline{u}$ of length |u|, it follows that u = u'. Hence
$(w,u)\delta = (w',u')\delta$. Therefore $K\delta$ maps bijectively onto S × T.

Thus $K\delta$ is a Markov language for S × T and so S × T is a Markov semigroup.

Theorem 14.3. Let S be a robustly Markov semigroup and let T be finite. Then S × T
is a robustly Markov semigroup if and only if it is finitely generated.

Proof of 14.3. Proceed as in the proof of Theorem 14.2, with L being a robust
Markov language for S. Since $\lambda_B(t) = 1$ for all $t \in T$, it follows that
$\lambda_{A \times B}(s,t) = \lambda_A(s)$. So, by its construction, $K\delta$ is a robust Markov language
for S × T. [14.3]

The corresponding result for being strongly Markov is still open: 
Question 14.4. Let S be strongly Markov and T finite. If S × T is finitely
generated, is it strongly Markov?

We conjecture that the answer to this question is 'yes', but that proving this
probably requires more complex reasoning than in the proofs of Theorems 14.2 and 14.3,
because the generating set for S × T may not project onto T, which complicates
the relationship between minimal lengths of representatives of elements
of S × T and T.

As remarked above, the following question is open: 

Question 14.5. Let S and T be Markov. If S x T is finitely generated, is it 
Markov? 

The following question also arises: 

Question 14.6. Is it true that whenever S x T is Markov, then both factors S 
and T are Markov? 

The answer to this question may shed light on the long-standing open
question of whether direct factors of automatic groups, monoids, or semigroups
must themselves be automatic (see [ECH+92, Open Question 4.1.2]
and [CRRT01, Question 6.6]).



15 FREE PRODUCTS

Theorem 15.1. The class of Markov monoids is closed under forming (monoid) free 
products. 

Proof of 15.1. The proof for groups generalizes directly [GdlH90a,
Proposition 9]. [15.1]

Theorem 15.2. The class of Markov semigroups, the class of robustly Markov semi- 
groups, and the class of strongly Markov semigroups are all closed under forming 
(semigroup) free products. 

Proof of 15.2. Let S and T be Markov semigroups. Let $K \subseteq A^+$ and $L \subseteq B^+$ be
semigroup Markov languages for S and T, respectively. Let

$M = (KL)^+ \cup (KL)^*K \cup (LK)^+ \cup (LK)^*L$.

Since the languages K and L are prefix-closed and regular, so is the language
M. Any element of the free product S * T has a unique representation as an
alternating product of elements of S and T. That is, S * T is the disjoint union
of

$X_1 = \{s_1t_1\cdots s_nt_n : s_i \in S, t_i \in T, n \in \mathbb{N}\}$,

$X_2 = \{s_1t_1\cdots s_nt_ns_{n+1} : s_i \in S, t_i \in T, n \in \mathbb{N} \cup \{0\}\}$,

$X_3 = \{t_1s_1\cdots t_ns_n : s_i \in S, t_i \in T, n \in \mathbb{N}\}$,

$X_4 = \{t_1s_1\cdots t_ns_nt_{n+1} : s_i \in S, t_i \in T, n \in \mathbb{N} \cup \{0\}\}$.

Since the languages K and L do not contain the empty word, every element of
$X_1$ (respectively $X_2$, $X_3$, $X_4$) has a unique representative in $(KL)^+$ (respectively
$(KL)^*K$, $(LK)^+$, $(LK)^*L$). So every element of S * T has a unique representative
in M. So M is a Markov language for S * T.
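
The four-way decomposition can be sanity-checked in the simplest case. The sketch below assumes, for illustration only, that S and T are free monogenic semigroups with Markov languages K = a+ and L = b+, so that S * T is the free semigroup on {a, b}. It takes the fourth component to be (LK)*L, so that it matches the alternating products in $X_4$, which end in an element of T. The names `COMPONENTS` and `components_of` are hypothetical.

```python
import re
from itertools import product

K = "a+"  # assumed Markov language for the free monogenic semigroup on a
L = "b+"  # assumed Markov language for the free monogenic semigroup on b

# One regular expression per class of alternating products.
COMPONENTS = {
    "X1": f"(?:{K}{L})+",     # s t ... s t
    "X2": f"(?:{K}{L})*{K}",  # s t ... s t s
    "X3": f"(?:{L}{K})+",     # t s ... t s
    "X4": f"(?:{L}{K})*{L}",  # t s ... t s t
}


def components_of(word):
    """Return the names of the components of M that contain `word`."""
    return [name for name, pattern in COMPONENTS.items()
            if re.fullmatch(pattern, word)]


# Every non-empty word over {a, b} up to length 6 should lie in exactly one
# component, reflecting uniqueness of representatives in M.
all_words = ["".join(p) for n in range(1, 7) for p in product("ab", repeat=n)]
exactly_one = all(len(components_of(w)) == 1 for w in all_words)
```

Here the component is determined by the first and last letters of the word, which is the toy-case shadow of the disjointness of $X_1, \ldots, X_4$.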
Following the same reasoning with S and T being robustly Markov semigroups
and K and L being robust Markov languages shows that M is a robust
Markov language for S * T, since

$\lambda_{A \cup B}(s_1t_1\cdots s_nt_n) = \sum_{i=1}^{n} (\lambda_A(s_i) + \lambda_B(t_i))$,

and similarly for alternating products in $X_2 \cup X_3 \cup X_4$.

Finally suppose that S and T are strongly Markov semigroups. Let C be a
finite alphabet representing a generating set for S * T. Since S * T is a semigroup
free product, C contains subalphabets A and B representing generating sets
for S and T respectively. Since S and T are strongly Markov semigroups, there
exist robust Markov languages $K \subseteq A^+$ and $L \subseteq B^+$ for S and T respectively.
Thus, by the preceding paragraph, $M \subseteq (A \cup B)^+ \subseteq C^+$ is a robust Markov
language for S * T. Since C was arbitrary, S * T is strongly Markov. [15.2]



16 FINITE-INDEX EXTENSIONS AND SUBSEMIGROUPS

Many properties of groups are known to be preserved under 
passing from groups to finite-index extensions and subgroups; for example, 
finite generation and presentability. For semigroups, the most well-known 
notion of index is the Rees index: if T is a subsemigroup of a semigroup S,
then T has finite index in S if S - T is finite. Many properties of semigroups
are known to be preserved on passing to finite Rees index extensions and 
subsemigroups; for example, finite generation [RUS98, Theorem 1.1], finite 
presentability [RUS98, Theorem 1.3], and automaticity [HTR02, Theorem 1.1]. 
The following result fits this pattern: 

Theorem 16.1. The class of Markov semigroups is closed under forming finite Rees 
index extensions and subsemigroups. 

Proof of 16.1. Let S be a semigroup and let T be a finite Rees index subsemi- 
group of S. 

Suppose that T is Markov and that L is a Markov language for T over some 
finite alphabet A representing a generating set for T. Let B be an alphabet in 
bijection with S - T; then B is finite since T has finite Rees index in S. Without
loss of generality, assume that B and A are disjoint. Then L U B is a Markov 
language for S. 

Now suppose that S admits a Markov language L over an alphabet A.
Define

$L(A,T) = \{w \in A^+ : \overline{w} \in T\}$.

Let C be an alphabet of unique representatives for S - T. For any word
$w \in A^* - L(A,T)$, let $\overline{w}$ denote the unique element of $C \cup \{\varepsilon\}$ representing
the same element as w, or $\varepsilon$ if $w = \varepsilon$.

Define the alphabet

$D = \{d_{\rho,a,\sigma} : \rho, \sigma \in C \cup \{\varepsilon\}, a \in A, a\sigma \in L(A,T) \wedge \rho a\sigma \in L(A,T)\}$,

and let it represent elements of T as follows:

$\overline{d_{\rho,a,\sigma}} = \overline{\rho a\sigma}$.
Notice that if A is finite, D too must be finite.

Let $\mathcal{R} \subseteq A^+ \times D^+$ be the relation consisting of pairs

$(w_{n+1}a_nw_na_{n-1}w_{n-1}\cdots a_2w_2a_1w_1,\; d_{\overline{w_{n+1}},a_n,\overline{w_n}}\,d_{\varepsilon,a_{n-1},\overline{w_{n-1}}}\cdots d_{\varepsilon,a_2,\overline{w_2}}\,d_{\varepsilon,a_1,\overline{w_1}})$

where the left-hand side lies in L(A,T) and the factorization of the left-hand
side is obtained in the following way: start by letting the left-hand side be $w'_1$;
a partial factorization

$w'_{i+1}a_iw_i\cdots a_1w_1$

is complete if $w'_{i+1} \notin L(A,T)$ (in which case $n = i$ and $w_{n+1} = w'_{i+1}$); if on the
other hand $w'_{i+1} \in L(A,T)$, set $a_{i+1}w_{i+1}$ to be the shortest suffix of $w'_{i+1}$ lying in
L(A,T) and let $w'_{i+2}$ be the remainder of $w'_{i+1}$.

Notice that if $(w,u) \in \mathcal{R}$ then $\overline{w} = \overline{u}$ by the definition of how the alphabet
D represents elements of T, and that each word w determines a unique word
u such that $(w,u) \in \mathcal{R}$.
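
The factorization procedure is essentially a greedy right-to-left scan, which the following sketch tries to make concrete. It is an illustration under assumptions that are not part of the paper: S is taken to be the positive integers under addition, generated by a single letter a, and T is the subsemigroup of integers at least 3, so that S - T = {1, 2} is finite. Words in A* - L(A,T) are used as stand-ins for their representatives in C, and a letter of D is modelled as a triple (rho, a, sigma). The function names are hypothetical.

```python
def factorize(word, in_T):
    """Greedy factorization of a word of L(A,T).

    Returns [w_{n+1}, a_n w_n, ..., a_1 w_1]: repeatedly strip the shortest
    suffix lying in L(A,T) until the remaining prefix is no longer in L(A,T).
    """
    assert word and in_T(word), "the left-hand side must lie in L(A,T)"
    rest, blocks = word, []
    while rest and in_T(rest):
        # The shortest suffix of `rest` in L(A,T); the loop always breaks,
        # because `rest` itself is such a suffix.
        for k in range(1, len(rest) + 1):
            if in_T(rest[-k:]):
                break
        blocks.append(rest[-k:])
        rest = rest[:-k]
    return [rest] + blocks[::-1]


def to_D_word(factors):
    """Translate a factorization into a word over D, as triples (rho, a, sigma)."""
    head, blocks = factors[0], factors[1:]
    letters = []
    for index, block in enumerate(blocks):
        rho = head if index == 0 else ""  # only the leftmost letter carries w_{n+1}
        letters.append((rho, block[0], block[1:]))
    return letters


def in_T(word):
    """Toy membership test for L(A,T): a^k represents k, and T = {3, 4, ...}."""
    return len(word) >= 3
```

For the word a^7 this yields the factors a, aaa, aaa and hence two letters of D, representing 4 and 3 respectively, whose product is again 7.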

Lemma 16.2. The relation $\mathcal{R}$ is rational.

Proof of 16.2. It is easier to explain how a two-tape finite state automaton
$\mathcal{A}$ can recognize $\mathcal{R}$ when reading from right to left; since the class of rational
relations is closed under reversal, it will then follow that $\mathcal{R}$ is rational.

By the dual of [RT98, Theorem 4.3], S admits a left congruence $\Lambda$ of finite
index (that is, having finitely many equivalence classes) contained within
$(T \times T) \cup \Delta_{S-T}$, where $\Delta_{S-T}$ is the diagonal relation on S - T (that is,
$\{(s,s) : s \in S - T\}$).

Imagine the automaton $\mathcal{A}$ reading letters from A from its left-hand input
tape and outputting symbols from D on its right-hand tape. Suppose the content
of its left-hand tape is w. As it reads symbols from w (moving from right
to left along the tape), it keeps track of the $\Lambda$-class of the element represented
by the suffix of w read so far. (This is possible because $\Lambda$ is a left congruence
with only finitely many equivalence classes.) In particular, $\mathcal{A}$ knows whether
the element represented by the suffix read so far lies in T (or equivalently
whether the suffix read so far lies in L(A,T)), or, if the element so represented
lies in S - T, which letter of $C \cup \{\varepsilon\}$ represents it. When $\mathcal{A}$ reads a symbol a such
that the suffix read so far (say aw') lies in L(A,T), it non-deterministically
chooses one of two actions:

1. It outputs $d_{\varepsilon,a,\overline{w'}}$, resets its store of the suffix read so far to $\varepsilon$, and
continues to read from its left-hand tape.

2. It outputs $d_{c,a,\overline{w'}}$, where c is a non-deterministically chosen element of
$C \cup \{\varepsilon\}$, then reads the remainder v of its left-hand tape and accepts if and
only if $\overline{v} = c$. (Notice that this is the only way that $\mathcal{A}$ can accept.)

By induction on the subscripts of the letters $a_i$, the automaton $\mathcal{A}$ can accept
only by outputting letters $d_{\varepsilon,a_i,\overline{w_i}}$ immediately after reading the suffix
$a_iw_i\cdots a_1w_1$ and the letter $d_{\overline{w_{n+1}},a_n,\overline{w_n}}$ immediately after reading $a_nw_n\cdots a_1w_1$,
and can accept only when $w_{n+1} \notin L(A,T)$. So $\mathcal{A}$ recognizes $\mathcal{R}$, reading from
right to left. [16.2]

By Lemma 16.2,

$K = L \circ \mathcal{R} = \{u \in D^* : (\exists v \in L)((v,u) \in \mathcal{R})\}$

is regular. Since the set of left-hand sides of elements of $\mathcal{R}$ is L(A,T), the
language K maps onto T.

Suppose $u_1, u_2 \in K$ are such that $\overline{u_1} = \overline{u_2}$. Let $w_1, w_2 \in L$ be such that
$(w_1,u_1), (w_2,u_2) \in \mathcal{R}$. Since L maps bijectively onto S and
$\overline{w_1} = \overline{u_1} = \overline{u_2} = \overline{w_2}$, the words $w_1$ and $w_2$ must be identical. Since every
$w \in L(A,T)$ determines a unique $u \in D^+$ with $(w,u) \in \mathcal{R}$, it follows that $u_1$ and
$u_2$ are identical. So K maps bijectively onto T.

Finally let $u \in K$ with $|u| \geq 2$. Then $u = d_{c_{n+1},a_n,c_n}\cdots d_{\varepsilon,a_2,c_2}d_{\varepsilon,a_1,c_1}$,
with $n \geq 2$. Then there is some word $w \in L$ with $(w,u) \in \mathcal{R}$. By the definition
of $\mathcal{R}$, the word w factorizes as $w_{n+1}a_nw_n\cdots a_2w_2a_1w_1 \in L$ with $\overline{w_i} = c_i$,
and $a_1w_1, a_2w_2, \ldots, w_{n+1}a_nw_n \in L(A,T)$.

Since L is prefix-closed, $w_{n+1}a_nw_n\cdots a_2w_2 \in L$. Since $a_2w_2, \ldots, w_{n+1}a_nw_n \in
L(A,T)$, it follows that $w_{n+1}a_nw_n\cdots a_2w_2 \in L(A,T)$. So, by the definition of
$\mathcal{R}$, it follows that $d_{c_{n+1},a_n,c_n}\cdots d_{\varepsilon,a_2,c_2} \in K$.

This shows that K is closed under taking longest proper non-empty prefixes.
By induction, K is +-prefix-closed. Hence K is a Markov language for
T. [16.1]

However, the Rees index has the disadvantage that it does not generalize
the group index. This motivated Gray & Ruškuc [GR08] to develop the notion
of Green index, which does generalize the group index. The definition and
only the necessary properties of the Green index and related topics are given
here; the reader is referred to [GR08, § 1] for further details.

Definition 16.3. Let S be a semigroup and let T be a subsemigroup of S. The
T-relative Green's relations $\mathcal{R}^T$, $\mathcal{L}^T$, and $\mathcal{H}^T$ are defined on S as follows: for
$x, y \in S$,

$x \mathrel{\mathcal{R}^T} y \iff xT^1 = yT^1$, $\quad x \mathrel{\mathcal{L}^T} y \iff T^1x = T^1y$, $\quad \mathcal{H}^T = \mathcal{R}^T \cap \mathcal{L}^T$;

these are equivalence relations [GR08, § 1]. The T-relative $\mathcal{R}^T$-, $\mathcal{L}^T$-, and
$\mathcal{H}^T$-classes (that is, the equivalence classes of these relations) respect T, in the
sense that each such class lies either wholly in T or wholly in S - T.

The Green index of T in S is defined to be one more than the number of
$\mathcal{H}^T$-classes in S - T.

Several properties are known to be preserved under passing to finite Green
index extensions and subsemigroups, such as finite generation [CGR,
Theorems 4.1 & 4.3]; others are known to hold on passing to finite Green index
subsemigroups and not on passing to finite Green index extensions, such as
automaticity [CGR, Theorem 10.1 & Example 10.3]. The following example
shows that neither the class of Markov semigroups nor the class of strongly
Markov semigroups is closed under finite Green index extensions. Indeed,
a finite Green index extension of a strongly Markov semigroup need not be
Markov:

Example 16.4. Let G be a finitely generated infinite torsion group. Let B be an
alphabet representing a generating set for G. Let A be a finite alphabet in
bijection with B. Let F be the free group with basis A. The bijection from
A to B naturally extends to a surjective homomorphism $\phi : F \to G$. Let S
be the strong semilattice of groups $\mathcal{S}(F, G, \phi)$. (See [How95, §§ 4.1-4.2] for
background on strong semilattices of groups.)

The free group F is hyperbolic and therefore strongly Markov. Moreover, F
is a finite Green index subsemigroup of S, with S - F consisting of the single
$\mathcal{H}^F$-class G.

Suppose that S is Markov. Then by Proposition 5.3, S admits a regular language
of unique normal forms L over the alphabet $A \cup B$. By the definition of
multiplication in a strong semilattice of monoids, the words in L representing
elements of G are precisely those that include at least one letter from B. That is, the
language of words in L representing elements of G is $K = L - A^*$. Since L is
regular, K is also. Since L maps bijectively onto S and $K \subseteq L$, it follows that K
maps bijectively onto G. So if each letter $a \in A$ is interpreted as representing
the element $a\phi$ of G, then K is a regular language of unique normal forms
for G. However, G, as a finitely generated infinite torsion group, does not admit
a regular language of unique normal forms by the reasoning in [ECH+92,
Example 2.5.12]. This is a contradiction, and so S cannot be Markov.

This example is similar in spirit to examples showing that neither the class
of finitely presented semigroups nor the class of automatic semigroups is
closed under forming finite Green index extensions [CGR, Examples 6.5
& 10.3]. However, with an extra condition on the Schützenberger groups of
the T-relative $\mathcal{H}$-classes in the complement, a positive result does hold. First
of all, recall the definitions of Schützenberger groups:

Definition 16.5. Retain notation from Definition 16.3. Let H be an
$\mathcal{H}^T$-class. Let $\mathrm{Stab}(H) = \{t \in T^1 : Ht = H\}$ (the stabilizer of H in T),
and define an equivalence $\sigma(H)$ on Stab(H) by $(x, y) \in \sigma(H)$ if and only if
$hx = hy$ for all $h \in H$. Then $\sigma(H)$ is a congruence on Stab(H) and
$\mathrm{Stab}(H)/\sigma(H)$ is a group, called the Schützenberger group of the
$\mathcal{H}^T$-class H and denoted $\Gamma(H)$.

Proposition 16.6. Let S be a semigroup and T a subsemigroup of S of finite Green
index. Suppose that T is Markov and that the Schützenberger group of every
T-relative $\mathcal{H}$-class in $S - T$ is Markov. Then S is Markov.

Proof of 16.6. Let L be a semigroup Markov language for T over some finite
alphabet A representing a generating set for T under the map $\phi : A \to T$.
Since T has finite Green index in S, there are finitely many T-relative
$\mathcal{H}$-classes $H_1, \ldots, H_n$ in $S - T$. By hypothesis, every Schützenberger
group $\Gamma(H_i)$ admits a semigroup Markov language $L_i$ over some finite alphabet
$A_i$ representing a generating set for $\Gamma(H_i)$ under the map
$\phi_i : A_i \to \Gamma(H_i)$. For brevity, let $\sigma_i = \sigma(H_i)$.

For each $i = 1, \ldots, n$, fix an element $h_i \in H_i$. For each $i = 1, \ldots, n$ and
$a \in A_i$, fix an element $s_{i,a} \in \mathrm{Stab}(H_i)$ such that $a\phi_i = [s_{i,a}]_{\sigma_i}$.

Let $A'_i$ be a new alphabet in bijection with $A_i$ under the map
$\alpha_i : A_i \to A'_i$. (Without loss of generality, assume that the alphabet A and
the various alphabets $A_i$ and $A'_i$ are pairwise disjoint.) Define a map
$\psi_i : A_i \cup A'_i \to S$ as follows:

\[
a\psi_i = s_{i,a}, \qquad (a\alpha_i)\psi_i = h_i s_{i,a} \qquad \text{for } a \in A_i. \tag{16.1}
\]




Let

\[
L'_i = \{(a\alpha_i)u \in A'_i A_i^* : au \in L_i,\ a \in A_i\}.
\]

(So $L'_i$ is the language obtained from $L_i$ by taking each word in $L_i$ and
replacing its first letter with the corresponding letter from $A'_i$.) Notice that
since $L_i$ is regular and +-prefix-closed, so is $L'_i$.
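
The passage from $L_i$ to $L'_i$ is purely syntactic, so it can be checked
mechanically on a small sample. The following is a minimal sketch, assuming
that "+-prefix-closed" means closed under taking non-empty prefixes. The toy
language `L_I`, the priming map `PRIME` (upper-case letters stand in for
$A'_i$) and the function names are illustrative assumptions and do not come
from the text.

```python
# Toy stand-in for L_i: a finite +-prefix-closed language over {x, y}.
# (Illustrative assumption; a real Markov language is usually infinite.)
L_I = {"x", "xy", "xyy", "y", "yx"}

# Hypothetical bijection alpha_i : A_i -> A'_i; upper case is the primed copy.
PRIME = {"x": "X", "y": "Y"}


def is_plus_prefix_closed(language):
    """True if every non-empty proper prefix of every word lies in the language."""
    return all(w[:k] in language for w in language for k in range(1, len(w)))


def prime_first_letter(word):
    """Replace the first letter of a non-empty word by its primed copy."""
    return PRIME[word[0]] + word[1:]


# The analogue of L'_i for the toy language.
L_I_PRIMED = {prime_first_letter(w) for w in L_I}

if __name__ == "__main__":
    print(sorted(L_I_PRIMED))
    print(is_plus_prefix_closed(L_I), is_plus_prefix_closed(L_I_PRIMED))
```

In this sample the check succeeds because every non-empty prefix of a primed
word is the primed version of a non-empty prefix of the original word.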
Since $\Gamma(H_i)$ acts regularly on $H_i$ via

\[
x \cdot [s]_{\sigma_i} = xs,
\]

it follows that for every $y \in H_i$ there is a unique element
$[s]_{\sigma_i} \in \Gamma(H_i)$ such that $h_i \cdot [s]_{\sigma_i} = y$. Thus it follows from (16.1)
and the fact that $L_i$ is a Markov language for $\Gamma(H_i)$ that for every $y \in H_i$
there is a unique $w \in L_i$ such that $h_i(w\psi_i) = y$. Hence, by (16.1) and the
definition of $L'_i$, for every $y \in H_i$ there is a unique word $v \in L'_i$ with
$v\psi_i = y$. Thus $L'_i$ maps bijectively onto $H_i$.
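Finally let

\[
K = L \cup \bigcup_{i=1}^{n} L'_i.
\]

Then K is +-prefix-closed and regular. Define

\[
\psi : A \cup \bigcup_{i=1}^{n} (A_i \cup A'_i) \to S, \qquad
a\psi = \begin{cases}
a\phi & \text{if } a \in A, \\
a\psi_i & \text{if } a \in A_i \cup A'_i.
\end{cases}
\]

Then $\psi$ maps K bijectively onto S. Hence K is a semigroup Markov language
for S. □ 16.6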

Proposition 16.6 parallels [CGR, Theorem 6.1], which shows that if T is
a finite Green index subsemigroup of S, and T and all the Schützenberger
groups of the T-relative $\mathcal{H}$-classes in $S - T$ are finitely presented, then S is
finitely presented. (As remarked above, without the finite presentability
condition on the Schützenberger groups, this result does not hold.) This is in
marked contrast to the situation for automatic semigroups: even if T and all the
Schützenberger groups are automatic, S may not be automatic; see [CGR,
Example 10.3].

Question 16.7. Let T be a subsemigroup of finite Green index in a semigroup
S. Suppose that S is Markov. Is T necessarily Markov?

Question 16.8. Is the property of being Markov preserved under passing to 
subsemigroups and extensions of finite Grigorchuk index for finitely gener- 
ated cancellative semigroups (so that both of the semigroups are finitely gen- 
erated)? 



17 THE CLASS OF MARKOV LANGUAGES 

This final section examines the class of languages that are Markov 
languages for some semigroup or monoid. First, notice that not every regular 
language is a Markov language: 

Example 17.1. Let $L = a^+ \cup a^+b$. Suppose L is a Markov language for a
semigroup S. Then b lies in S and so must be represented by an element of
L. If $b = a^k$ for some k, then $ab = aa^k = a^{k+1}$. Since both $ab$ and $a^{k+1}$
lie in L, this contradicts the uniqueness of representatives in L. If, on the other
hand, $b = a^kb$ for some k, then $ab = aa^kb = a^{k+1}b$, again contradicting the
uniqueness of representatives in L. So L is not a semigroup Markov language.

Indeed, if instead $L' = L \cup \{\varepsilon\} = a^* \cup a^+b$, then the same contradictions
show that L' is not a monoid Markov language.

Starting from a Markov language and adding or removing a finite number
of words can yield a prefix-closed regular language that is not a Markov
language, as the following two examples show:

Example 17.2. Let $K = L' \cup \{b\} = a^* \cup a^*b$, where L' is the language from
Example 17.1. Then K is a Markov language for the monoid presented
by $\langle a, b \mid (b^2, b), (ba, b) \rangle$. To see this, notice that
$(\{a, b\}, \{(b^2, b), (ba, b)\})$ is a confluent noetherian rewriting system and its
language of normal forms is K, and apply Proposition 7.1. Thus removing the
single word b from the Markov language K yields the non-Markov language L'.
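
The claims about the rewriting system in Example 17.2 can be sanity-checked
by brute force on short words. The following is a minimal sketch, assuming the
rules are oriented as $b^2 \to b$ and $ba \to b$. It explores every reduction
sequence, so a word with two distinct irreducible descendants would expose a
failure of confluence. The function names are illustrative, and a finite check
of this kind is evidence rather than a proof.

```python
import re
from itertools import product

# Rewriting rules of Example 17.2, oriented left to right (assumed orientation).
RULES = [("bb", "b"), ("ba", "b")]


def one_step(word):
    """Return the set of words obtained from `word` by one rule application."""
    results = set()
    for lhs, rhs in RULES:
        start = word.find(lhs)
        while start != -1:
            results.add(word[:start] + rhs + word[start + len(lhs):])
            start = word.find(lhs, start + 1)
    return results


def irreducible_descendants(word):
    """Return every irreducible word reachable from `word` by rewriting."""
    seen, stack, irreducible = {word}, [word], set()
    while stack:
        current = stack.pop()
        successors = one_step(current)
        if not successors:
            irreducible.add(current)
        for nxt in successors - seen:
            seen.add(nxt)
            stack.append(nxt)
    return irreducible


def words_up_to(length):
    """All words over {a, b} of length at most `length`."""
    return ["".join(p) for n in range(length + 1) for p in product("ab", repeat=n)]


if __name__ == "__main__":
    words = words_up_to(6)
    unique = all(len(irreducible_descendants(w)) == 1 for w in words)
    normal_forms = {w for w in words if not one_step(w)}
    in_k = {w for w in words if re.fullmatch(r"a*b?", w)}
    print(unique, normal_forms == in_k)
```

Here the regular expression `a*b?` describes $K = a^* \cup a^*b$, so the second
comparison checks that the irreducible words of length at most 6 are exactly
the words of K of that length.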

Example 17.3. Let $L = a^* \cup \{a^2c, a^4c\}$. Suppose L is a Markov language for a
semigroup S. Then $ac$ lies in S and so must be represented by an element of
L. Now, if $ac = a^\alpha$ for some $\alpha$, then $a^2c = a^{\alpha+1}$, contradicting the
uniqueness of representatives in L. If $ac = a^2c$, then $ac = a^2c = a^3c = a^4c$,
again contradicting the uniqueness of representatives in L. So $ac = a^4c$.

Now, $a^3c$ must also be represented by an element of L. If $a^3c = a^\alpha$ for some
$\alpha$, then $a^4c = a^{\alpha+1}$, contradicting the uniqueness of representatives in L. If
$a^3c = a^2c$, then $a^2c = a^3c = a^4c$, again contradicting the uniqueness of
representatives in L. So $a^3c = a^4c$, which, by the preceding paragraph, implies
$ac = a^3c$, which in turn implies $a^2c = a^4c$. This contradicts the uniqueness
of representatives in L, and so L cannot be a Markov language.

Thus adding the two words $a^2c$ and $a^4c$ to the Markov language $a^*$ yields
the non-Markov language L.

There are two main questions about the class of Markov languages: 

Question 17.4. Is there an algorithm that takes a regular language that is
prefix-closed or +-prefix-closed and decides whether it is a Markov language
for some monoid or semigroup?

Question 17.5. Is every finite language that is prefix-closed or +-prefix-closed 
a Markov language for a (necessarily finite) monoid or semigroup? 
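
For very small finite languages, the monoid case of Question 17.5 can be
explored by exhaustive search. The sketch below assumes the following
reformulation, which is not taken from the text: a finite prefix-closed
language L over A is a monoid Markov language precisely when there is a map
$\delta : L \times A \to L$ with $\delta(w, a) = wa$ whenever $wa \in L$, such that
$u \cdot \delta(w, a) = \delta(u \cdot w, a)$ for all $u, w \in L$ and $a \in A$, where
$u \cdot v$ denotes the result of applying $\delta$ to u letter by letter along v.
The function names are hypothetical, and the search is exponential in the
number of undetermined values of $\delta$.

```python
from itertools import product


def is_prefix_closed(language):
    """True if every prefix (including the empty word) of every word is present."""
    return all(w[:k] in language for w in language for k in range(len(w)))


def act(delta, start, word):
    """Apply the transition table `delta` to `start`, one letter of `word` at a time."""
    current = start
    for letter in word:
        current = delta[(current, letter)]
    return current


def find_markov_structure(language, alphabet):
    """Search for a table delta : L x A -> L satisfying the assumed criterion.

    Returns such a table as a dict, or None if the language is empty, is not
    prefix-closed, or admits no suitable table.
    """
    words = sorted(set(language), key=lambda w: (len(w), w))
    word_set = set(words)
    if not word_set or not is_prefix_closed(word_set):
        return None
    # delta(w, a) is forced to be wa whenever wa already lies in the language.
    forced = {(w, a): w + a for w in words for a in alphabet if w + a in word_set}
    free = [(w, a) for w in words for a in alphabet if w + a not in word_set]
    for choice in product(words, repeat=len(free)):
        delta = dict(forced)
        delta.update(zip(free, choice))
        compatible = all(
            act(delta, u, delta[(w, a)]) == delta[(act(delta, u, w), a)]
            for u in words
            for w in words
            for a in alphabet
        )
        if compatible:
            return delta
    return None


if __name__ == "__main__":
    print(find_markov_structure({"", "a"}, "a") is not None)
    print(find_markov_structure({"", "a", "b"}, "ab") is not None)
```

A returned table describes right multiplication by generators on the normal
forms, so it determines a finite monoid for which the language is a set of
unique representatives. A result of `None` for a prefix-closed language would,
under the assumed criterion, give a negative answer for that language.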